Deconvolution of Complex Samples Based on Cross-Cell Type Differential Analysis
DOI: 10.12677/aam.2024.1311468. Supported by research project funding.
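Authors: Tian Zhonghe (田中禾), Peng Ling (彭凌), Zhang Weiwei (张伟伟)*: School of Mathematical Information, Shaoxing University (绍兴文理学院数理信息学院), Shaoxing, Zhejiang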
Keywords: Complex Sample; Cell Composition; Differential Analysis; Cross-Cell Type; Deconvolution
Abstract: Genomic data from complex samples measure the average signal across multiple cell types, so differences in cell composition between samples can bias many downstream analyses. Accurately estimating cell composition is therefore an important first step in analyzing complex samples. Many computational methods have been developed for this purpose, but most have limited applicability because they depend on reference data or prior information that is often unavailable. In this work, we develop a reference-free deconvolution algorithm based on optimal feature selection. The algorithm iteratively searches for cell-type-specific features by combining two cross-cell type differential analyses, one cell type versus the other cell types and two cell types versus the other cell types, and then estimates cell composition. Simulation studies and analyses of two real datasets demonstrate the favorable performance of the proposed method.
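The iterative loop outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`nmf_proportions`, `cross_cell_type_scores`, `iterative_deconvolution`), the use of plain multiplicative-update NMF as the reference-free deconvolution step, the t-like contrast statistic, and the simulated data are all assumptions made for illustration. It only shows the general pattern of alternating between composition estimation and feature re-selection using one-versus-rest and two-versus-rest contrasts.

```python
import numpy as np
from itertools import combinations

def nmf_proportions(Y, K, n_iter=200, seed=0):
    """Stand-in reference-free step: multiplicative-update NMF, Y ~ W @ H.
    Y is features x samples and must be non-negative."""
    rng = np.random.default_rng(seed)
    G, N = Y.shape
    W = rng.random((G, K)) + 0.1
    H = rng.random((K, N)) + 0.1
    eps = 1e-10
    for _ in range(n_iter):
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    # Rescale so that each sample's proportions sum to one.
    return H / (H.sum(axis=0, keepdims=True) + eps)

def cross_cell_type_scores(Y, H):
    """Score every feature by its largest |t|-like statistic over
    one-vs-rest and two-vs-rest contrasts (hypothetical scoring rule)."""
    K, N = H.shape
    X = H.T                                   # samples x cell types design
    XtX_inv = np.linalg.pinv(X.T @ X)
    B = Y @ X @ XtX_inv                       # per-feature cell-type means
    resid = Y - B @ X.T
    sigma2 = (resid ** 2).sum(axis=1) / max(N - K, 1) + 1e-12
    groups = [(k,) for k in range(K)] + list(combinations(range(K), 2))
    best = np.zeros(Y.shape[0])
    for g in groups:
        if len(g) >= K:                       # no "other" cell types left
            continue
        c = np.full(K, -1.0 / (K - len(g)))   # weights for the rest
        c[list(g)] = 1.0 / len(g)             # weights for the group
        se = np.sqrt(sigma2 * (c @ XtX_inv @ c) + 1e-12)
        best = np.maximum(best, np.abs(B @ c) / se)
    return best

def iterative_deconvolution(Y, K, n_top=100, n_rounds=3):
    """Alternate between composition estimation and feature re-selection."""
    # Start from the most variable features.
    idx = np.argsort(Y.var(axis=1))[::-1][:n_top]
    for _ in range(n_rounds):
        H = nmf_proportions(Y[idx], K)
        scores = cross_cell_type_scores(Y, H)
        idx = np.argsort(scores)[::-1][:n_top]
    return nmf_proportions(Y[idx], K), idx

# Simulated mixtures (assumed data, for illustration only).
rng = np.random.default_rng(1)
G, N, K = 500, 40, 4
W_true = rng.gamma(2.0, 1.0, size=(G, K))          # cell-type profiles
H_true = rng.dirichlet(np.ones(K), size=N).T       # true proportions
Y = W_true @ H_true + 0.05 * rng.random((G, N))    # non-negative noise
H_est, selected = iterative_deconvolution(Y, K)
print(H_est.shape, selected.shape)
```

Because the estimation is reference-free, the rows of `H_est` come out in arbitrary order; comparing them with known proportions would first require matching estimated components to cell types, for example by correlation.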
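Cite this article: Tian, Z., Peng, L. and Zhang, W. (2024) Deconvolution of Complex Samples Based on Cross-Cell Type Differential Analysis. Advances in Applied Mathematics, 13(11), 4870-4875. https://doi.org/10.12677/aam.2024.1311468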
