Tumor Heterogeneity Decomposition Based on Methylation BS-seq Data
Abstract: Cellular heterogeneity in tumor tissue is a key factor that confounds downstream epigenomic analysis. Existing deconvolution algorithms for DNA methylation bisulfite sequencing (BS-seq) data are mostly limited to a binary "normal-tumor" composition assumption, which makes it difficult to resolve a complex tumor microenvironment accurately. To address this, we propose MethEML, a statistical inference model based on maximum likelihood estimation and the expectation-maximization (EM) algorithm. MethEML requires only the methylation profile of the mixed (bulk) tumor tissue: without any external reference samples, it jointly infers the mixing proportions of multiple cell subpopulations and the methylation profile specific to each subpopulation, and it uses the Bayesian Information Criterion (BIC) to determine the number of subpopulations K adaptively. Experiments on simulated datasets constructed from real cell lines (HCC1954 and HMEC) show that MethEML overcomes the restriction of the mainstream algorithm MethylPurify to two cell types (K=2). Across test scenarios with different sequencing coverages and mixing proportions, MethEML predicts significantly more accurately than the comparison algorithm, with lower mean squared error and greater robustness, providing an efficient computational tool for precisely resolving tumor microenvironment heterogeneity.
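The abstract does not give the MethEML likelihood, so the following is only a minimal sketch of the general idea it describes: reference-free EM estimation of mixing proportions and subpopulation-specific methylation levels, followed by BIC selection of K. It assumes a simplified read-level Bernoulli mixture in which every read covers the same CpG sites; the function names (`simulate_reads`, `e_step`, `fit_em`, `bic`, `select_k`) and all parameter values are hypothetical and are not taken from the paper.

```python
import math
import random


def simulate_reads(pi, theta, n_reads, seed=0):
    """Draw binary reads (1 = methylated CpG) from a Bernoulli mixture."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        c = rng.choices(range(len(pi)), weights=pi)[0]
        reads.append([1 if rng.random() < t else 0 for t in theta[c]])
    return reads


def e_step(reads, pi, theta):
    """Return per-read responsibilities and the total log-likelihood."""
    resp, loglik = [], 0.0
    for read in reads:
        logp = []
        for p, th in zip(pi, theta):
            lp = math.log(p)
            for x, t in zip(read, th):
                lp += math.log(t if x else 1.0 - t)
            logp.append(lp)
        m = max(logp)  # log-sum-exp trick for numerical stability
        w = [math.exp(v - m) for v in logp]
        tot = sum(w)
        loglik += m + math.log(tot)
        resp.append([v / tot for v in w])
    return resp, loglik


def fit_em(reads, k, n_iter=100, seed=0, eps=1e-6):
    """Fit a k-component Bernoulli mixture by EM (assumed toy model)."""
    rng = random.Random(seed)
    n, n_sites = len(reads), len(reads[0])
    pi = [1.0 / k] * k
    theta = [[rng.uniform(0.25, 0.75) for _ in range(n_sites)]
             for _ in range(k)]
    for _iteration in range(n_iter):
        resp, _ll = e_step(reads, pi, theta)
        # M-step: responsibility-weighted proportions and methylation levels
        for c in range(k):
            nc = sum(row[c] for row in resp)
            pi[c] = max(nc / n, eps)
            for s in range(n_sites):
                num = sum(row[c] * read[s] for row, read in zip(resp, reads))
                # clamp away from 0 and 1 so the logarithms stay finite
                theta[c][s] = min(max(num / max(nc, eps), eps), 1.0 - eps)
        total = sum(pi)
        pi = [p / total for p in pi]
    _resp, loglik = e_step(reads, pi, theta)  # likelihood at final parameters
    return pi, theta, loglik


def bic(loglik, k, n_sites, n_reads):
    """BIC with (k - 1) free proportions and k * n_sites methylation levels."""
    n_params = (k - 1) + k * n_sites
    return -2.0 * loglik + n_params * math.log(n_reads)


def select_k(reads, candidates=(1, 2, 3)):
    """Return the candidate K with the lowest BIC, plus all BIC scores."""
    scores = {}
    for k in candidates:
        _pi, _theta, ll = fit_em(reads, k)
        scores[k] = bic(ll, k, len(reads[0]), len(reads))
    return min(scores, key=scores.get), scores
```

For example, `reads = simulate_reads([0.7, 0.3], [[0.95] * 8, [0.05] * 8], 400)` followed by `select_k(reads)` compares K = 1, 2, 3 on a toy two-population mixture. EM only finds a local optimum, so a practical implementation would probably use several random restarts and a convergence check rather than a fixed iteration count.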
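Cite this article: Yang, Q. and Zhang, W.W. Tumor Heterogeneity Decomposition Based on Methylation BS-seq Data [J]. Advances in Applied Mathematics, 2026, 15(5): 96-102. https://doi.org/10.12677/aam.2026.155211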
