HBM:基于双向蒙特卡洛树和混合贪心策略的序列反转求解方法
HBM: A Sequence Reversal Solving Method Based on Bidirectional Monte Carlo Tree and Hybrid Greedy Strategy
摘要: 基因序列重排是计算生物学中的经典组合优化问题,目标是以最少的子串反转操作将给定序列转化为目标序列,其在解析基因组进化机制方面具有重要意义。然而,该问题具有NP难特性,求解极具挑战。为提高求解效率与质量,本文提出一种基于双向蒙特卡洛树与混合贪心策略的序列反转求解模型(HBM)。该模型首先通过预处理去除已匹配的前后缀,压缩问题规模;进而采用分层递进求解框架,依次利用双向广度优先搜索获取初始路径,在未获解时启用混合贪心策略,融合五种启发式指标动态选择局部操作,最终通过双向蒙特卡洛树搜索平衡探索与利用,以获取高质量解。在特定DNA序列数据集上的实验表明,HBM在反转路径长度、时间–质量性价比及解质量稳定性方面均优于现有方法。
Abstract: Gene sequence rearrangement is a classic combinatorial optimization problem in computational biology, which aims to transform a given sequence into a target sequence with the minimum number of substring reversal operations and plays a crucial role in deciphering the mechanisms of genome evolution. However, this problem is characterized by NP-hardness, making its solution extremely challenging. To improve the efficiency and quality of the solution, this paper proposes a sequence reversal solving model (HBM) based on a bidirectional Monte Carlo tree and a hybrid greedy strategy. The model first compresses the problem scale by preprocessing to remove matched prefixes and suffixes; then adopts a hierarchical progressive solving framework, sequentially using bidirectional breadth-first search to obtain an initial path, activating a hybrid greedy strategy when no solution is obtained, integrating five heuristic metrics to dynamically select local operations, and finally balancing exploration and exploitation through bidirectional Monte Carlo tree search to acquire high-quality solutions. Experiments on a specific DNA sequence dataset show that HBM outperforms existing methods in terms of reversal path length, time-quality cost-effectiveness, and solution quality stability.
文章引用:刘威伟, 王颖, 李云菲, 王希胤. HBM:基于双向蒙特卡洛树和混合贪心策略的序列反转求解方法[J]. 计算生物学, 2026, 16(1): 1-14. https://doi.org/10.12677/hjcb.2026.161001

参考文献

[1] Liu, Y., Yi, C., Fan, C., Liu, Q., Liu, S., Shen, L., et al. (2023) Pan-Centromere Reveals Widespread Centromere Repositioning of Soybean Genomes. Proceedings of the National Academy of Sciences, 120, e2310177120. [Google Scholar] [CrossRef
[2] Jiang, Z., Zhu, X., Wang, X., Chen, Q., Zhang, W., Li, Z., et al. (2025) Chromosomal-Level Genome Assembly of Trichogramma Japonicum (Hymenoptera: Trichogrammatidae). Scientific Data, 12, 1537-1537. [Google Scholar] [CrossRef
[3] 张志良. 基于反转和删除操作的基因组重组全解空间研究[D]: [硕士学位论文]. 济南: 山东大学, 2012.
[4] Siepel, A.C. (2003) An Algorithm to Enumerate Sorting Reversals for Signed Permutations. Journal of Computational Biology, 10, 575-597. [Google Scholar] [CrossRef
[5] 黄鑫, 张志佳, 孙平, 等. 基于深度强化学习的路径规划算法综述[J]. 机器人, 2026, 48(1): 196-216.
[6] García, A. (2025) Greedy Algorithms: A Review and Open Problems. Journal of Inequalities and Applications, 2025, Article No. 11. [Google Scholar] [CrossRef
[7] https://link.cnki.net/urlid/11.2127.TP.20251128.0816.002
[8] Petersen, J.E., Kapur, S., Gkantonas, S., Mastorakos, E. and Giusti, A. (2023) Modelling and Optimisation of Extinction Actions for Wildfire Suppression. Combustion Science and Technology, 195, 3584-3595. [Google Scholar] [CrossRef
[9] Wei, L., Yi, L., Yan, M., et al. (2023) A Self-Learning Monte Carlo Tree Search Algorithm for Robot Path Planning. Frontiers in Neurorobotics, 17, Article ID: 1039644.
[10] Dai, W., Wang, J., Chen, Y., Wang, X., Yang, X. and Zhang, H. (2026) An Improved PSO for Operational Optimization of Steam Power Systems in Petrochemical Enterprises. Expert Systems with Applications, 297, Article ID: 129404. [Google Scholar] [CrossRef
[11] 张泽宁. 基于Alphago-Zero及图扩散模型的图构造方法研究[D]: [硕士学位论文]. 北京: 北京交通大学, 2024.
[12] 李国良, 王磊, 张金玉, 等. GaussDB: 智能云原生分布式数据库[J]. 电子学报, 2025, 53(4): 1103-1122.
[13] Baudet, C. and Dias, Z. (2010) An Improved Algorithm to Enumerate All Traces that Sort a Signed Permutation by Reversals. Proceedings of the 2010 ACM Symposium on Applied Computing, Sierre, 22-26 March 2010, 1521-1525. [Google Scholar] [CrossRef
[14] Christie, D.A. and Irving, R.W. (2001) Sorting Strings by Reversals and by Transpositions. SIAM Journal on Discrete Mathematics, 14, 193-206. [Google Scholar] [CrossRef
[15] Bian, J., Li, J. and Wang, G. (2024) Automatic Move Program for Two-Player Military Chess Based on UCB&MCTS Algorithm and Move Evaluation. International Core Journal of Engineer-ing, 10, 388-396.
[16] Tian, C. (2023) Monte-Carlo Tree Search with Epsilon-Greedy for Game of Amazons. In: ITM Depart-ment, Illinois Institute of Technology, Eds., Proceedings of the 3rd International Conference on Signal Processing and Machine Learning (Part 3), Faculty of Engineering, Architecture & Information Technology, University of Queensland, 420-425.
[17] 王孟珂. 2,4-表油菜素内酯对银杏类黄酮代谢的影响及相关基因的功能研究[D]: [硕士学位论文]. 南京: 南京林业大学, 2022.
[18] Bader, D.A., Moret, B.M.E. and Yan, M. (2001) A Linear-Time Algorithm for Computing Inversion Distance be-tween Signed Permutations with an Experimental Study. Journal of Computational Biology, 8, 483-491. [Google Scholar] [CrossRef
[19] Fang, B., Lai, J., Liu, Y., Liu, L., Yu, X., Li, X., et al. (2024) Hybrid Sequencing for Detailed Genetic Characterization of Human Adenoviruses. Scientific Reports, 14, Article No. 29490. [Google Scholar] [CrossRef
[20] 王学. 基因组中最大唯一匹配的查找算法研究[D]: [硕士学位论文]. 西安: 西安电子科技大学, 2009.