Research Progress of Pseudouridine Site Prediction Based on Machine Learning
DOI: 10.12677/HJBM.2022.122014
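Author: Rui Meng, School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning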
Keywords: RNA Modification; Pseudouridine; Site Prediction; Machine Learning
Abstract: RNA is readily modified during gene transcription. To date, researchers have identified more than one hundred RNA modifications; pseudouridine (Ψ) was the first to be discovered and is the most widespread of them. In recent years, as research in epigenetics has deepened, studies of pseudouridine have multiplied. Pseudouridine modification is essential to a range of cellular and physiological processes, and a key step in studying it is to accurately identify pseudouridine sites in the transcriptome. Because identifying these sites by experimental chemical methods is time-consuming and labor-intensive, computational methods based on machine learning are currently the best option. This paper reviews the state of research on machine-learning-based pseudouridine site prediction, surveys the datasets and evaluation methods that researchers have used, and summarizes the latest progress in the field. It then gives a brief overview of several representative machine learning models and offers suggestions for addressing current limitations.
Cite this article: Meng, R. (2022) Research Progress of Pseudouridine Site Prediction Based on Machine Learning. Hans Journal of Biomedicine, 12(2), 109-115. https://doi.org/10.12677/HJBM.2022.122014
