Analyzing Single-Cell ATAC-seq Data with Graph Regularized Non-Negative Matrix Factorization
Abstract: Single-cell ATAC sequencing (scATAC-seq) data are highly sparse and affected by technical noise. To address these problems, this study proposes a non-negative matrix factorization (NMF) model that combines graph regularization with a nuclear norm constraint. The low-rank factorization restores global information, while a cell similarity graph preserves the local manifold structure, yielding biologically consistent representations in the low-dimensional space. Experiments show that the method effectively recovers missing values, reveals clearer cell subpopulations in clustering and visualization, and improves the performance of downstream scATAC-seq analysis.
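The abstract does not state the objective function or the optimization scheme, so the following is only a minimal sketch of standard graph-regularized NMF with multiplicative updates, not the author's model. It minimizes `||X - U V^T||_F^2 + lam * Tr(V^T L V)`, where `L = D - W` is the Laplacian of a cell similarity graph, and it omits the nuclear norm constraint entirely. The function names (`knn_graph`, `gnmf`, `objective`), the cosine k-nearest-neighbour graph, and the parameters `lam` and `k` are assumptions made for illustration.

```python
import numpy as np


def knn_graph(X, k=5):
    """Symmetric binary k-nearest-neighbour graph over the columns (cells) of X.

    Cosine similarity and binary weights are assumptions of this sketch.
    """
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)      # unit-length cell profiles
    S = Xn.T @ Xn                          # cell-by-cell cosine similarity
    np.fill_diagonal(S, -np.inf)           # a cell is not its own neighbour
    idx = np.argsort(-S, axis=1)[:, :k]    # k most similar cells for each cell
    n = X.shape[1]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)              # symmetrise the adjacency matrix


def gnmf(X, rank, lam=1.0, k=5, n_iter=200, seed=0, eps=1e-10):
    """Factorize X (features x cells) as U @ V.T with a graph penalty on V."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank))              # feature (peak) loadings
    V = rng.random((n, rank))              # low-dimensional cell embedding
    W = knn_graph(X, k)
    D = np.diag(W.sum(axis=1))             # degree matrix of the graph
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V, W


def objective(X, U, V, W, lam):
    """Reconstruction error plus the graph smoothness penalty."""
    L = np.diag(W.sum(axis=1)) - W         # graph Laplacian
    return np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)


if __name__ == "__main__":
    # Toy sparse binary matrix standing in for a peak-by-cell matrix.
    rng = np.random.default_rng(1)
    X = (rng.random((40, 30)) < 0.2).astype(float)
    U, V, W = gnmf(X, rank=3, lam=0.5)
    print(U.shape, V.shape, objective(X, U, V, W, 0.5))
```

In this sketch the rows of `V` play the role of the low-dimensional cell representation used for clustering or visualization, and `U @ V.T` gives a dense reconstruction that could serve as an imputed matrix. How the published model weights the graph term and enforces the nuclear norm constraint is not specified in the abstract.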
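Citation: Zhang Yanjie (张焱杰). Analyzing Single-Cell ATAC-seq Data with Graph Regularized Non-Negative Matrix Factorization [J]. Advances in Applied Mathematics (应用数学进展), 2026, 15(4): 271-285. https://doi.org/10.12677/aam.2026.154156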
