CAPTLDA: Predicting LncRNA-Disease Associations Based on Capsule Networks and Transformers
Abstract: Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides that play pivotal roles in the pathogenesis of many diseases. Elucidating the associations between lncRNAs and diseases is therefore crucial for understanding the underlying pathogenic mechanisms and for developing new strategies for disease prevention, diagnosis, and treatment. Traditional biological experiments are valuable for identifying lncRNA-disease associations (LDAs), but they are often costly and time-consuming, so effective computational models for LDA prediction are needed. Current computational methods are often limited in how well they integrate multi-source data and capture complex higher-order relationship patterns in heterogeneous biological networks. This study proposes a novel computational framework, CAPTLDA, which integrates the similarities and associations of lncRNAs, diseases, and miRNAs into a weighted heterogeneous network adjacency matrix and introduces a capsule network to enhance feature learning. It also employs a Transformer encoder that combines a global multi-head agent attention mechanism with a parallel multi-head local attention mechanism, so that both global dependencies and local contextual information are captured for accurate LDA prediction. Experiments on two benchmark datasets show that CAPTLDA outperforms existing state-of-the-art methods. Case studies further validate its effectiveness in identifying potential disease-related lncRNAs.
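To make the idea of a weighted heterogeneous network adjacency matrix concrete, the following is a minimal sketch of one common way to assemble such a matrix: similarity matrices on the diagonal blocks and association matrices on the off-diagonal blocks. The function name, the argument names, and the two scalar weights `w_sim` and `w_assoc` are illustrative assumptions; the abstract does not specify how CAPTLDA weights or arranges its blocks.

```python
import numpy as np


def build_heterogeneous_adjacency(sim_l, sim_d, sim_m,
                                  assoc_ld, assoc_lm, assoc_dm,
                                  w_sim=1.0, w_assoc=1.0):
    """Assemble a block adjacency matrix over lncRNA, disease and miRNA nodes.

    Hypothetical layout (an assumption, not the paper's stated design):
        [ S_l      A_ld     A_lm ]
        [ A_ld^T   S_d      A_dm ]
        [ A_lm^T   A_dm^T   S_m  ]
    w_sim and w_assoc are placeholder scalar weights for the two block types.
    """
    n_l, n_d, n_m = sim_l.shape[0], sim_d.shape[0], sim_m.shape[0]
    # Check that every block has the shape the layout requires.
    expected = {
        "sim_l": (sim_l, (n_l, n_l)),
        "sim_d": (sim_d, (n_d, n_d)),
        "sim_m": (sim_m, (n_m, n_m)),
        "assoc_ld": (assoc_ld, (n_l, n_d)),
        "assoc_lm": (assoc_lm, (n_l, n_m)),
        "assoc_dm": (assoc_dm, (n_d, n_m)),
    }
    for name, (mat, shape) in expected.items():
        if mat.shape != shape:
            raise ValueError(f"{name} has shape {mat.shape}, expected {shape}")

    return np.block([
        [w_sim * sim_l,        w_assoc * assoc_ld,   w_assoc * assoc_lm],
        [w_assoc * assoc_ld.T, w_sim * sim_d,        w_assoc * assoc_dm],
        [w_assoc * assoc_lm.T, w_assoc * assoc_dm.T, w_sim * sim_m],
    ])


# Toy data: 3 lncRNAs, 2 diseases, 2 miRNAs (values are made up).
sim_l = np.eye(3)
sim_d = np.eye(2)
sim_m = np.eye(2)
assoc_ld = np.array([[1, 0], [0, 1], [1, 1]], dtype=float)
assoc_lm = np.array([[0, 1], [1, 0], [0, 0]], dtype=float)
assoc_dm = np.array([[1, 0], [0, 1]], dtype=float)

A = build_heterogeneous_adjacency(sim_l, sim_d, sim_m,
                                  assoc_ld, assoc_lm, assoc_dm,
                                  w_sim=0.5, w_assoc=1.0)
print(A.shape)  # (7, 7)
```

Because the off-diagonal blocks are transposes of one another and each similarity matrix is symmetric, the resulting matrix is symmetric, which matches the usual treatment of such networks as undirected graphs.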
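Cite this article: Zhang Jiahui, Tan Jianjun. CAPTLDA: Predicting LncRNA-Disease Associations Based on Capsule Networks and Transformers [J]. Hans Journal of Biomedicine (生物医学), 2026, 16(1): 11-24. https://doi.org/10.12677/hjbm.2026.161002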
