Cell Classification and Differential Gene Screening Based on t-SNE and UMAP Dimension Reduction
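DOI: 10.12677/AAM.2022.1110737. Supported by the National Natural Science Foundation of China.
Author: Yuanfu Li (李元夫): School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing, Jiangsu
Keywords: Single-Cell RNA Sequencing, t-SNE, UMAP, Significantly Differentiated Genes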
Abstract: Single-cell RNA sequencing has been widely applied to key biological questions such as cell heterogeneity, and the development of this technology also poses substantial challenges for gene data analysis. This paper applies two nonlinear dimensionality reduction methods, t-SNE and UMAP, to reduce and cluster single-cell RNA data, and compares the results with those obtained by clustering after linear principal component dimensionality reduction. The comparison leads to the conclusion that UMAP gives the better dimensionality reduction and clustering results on single-cell RNA data. Finally, taking the UMAP-based clustering result as an example, the significantly differentiated genes in the different cell categories are screened out.
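The final step described in the abstract, screening genes that differ significantly between cell categories, can be illustrated with a minimal sketch. The abstract does not state which statistical test the paper uses, so the sketch below assumes a one-versus-rest comparison scored by Welch's t statistic and a log2 fold change. The function names `welch_t` and `rank_marker_genes`, the pseudocount, and the toy expression values are all hypothetical and serve only as illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances.

    Each sample needs at least two values, since the sample variance is used.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_marker_genes(expr, labels, cluster, pseudocount=1.0):
    """Rank genes by how strongly they separate `cluster` from all other cells.

    expr:    dict mapping gene name -> list of expression values, one per cell
    labels:  list of cluster labels, one per cell (assumed to come from
             clustering a low-dimensional embedding such as UMAP)
    cluster: the label to compare against the remaining cells
    Returns (gene, log2 fold change, t statistic) tuples, largest |t| first.
    """
    results = []
    for gene, values in expr.items():
        inside = [v for v, lab in zip(values, labels) if lab == cluster]
        outside = [v for v, lab in zip(values, labels) if lab != cluster]
        # The pseudocount (an assumption) avoids division by zero for silent genes.
        log2fc = math.log2((mean(inside) + pseudocount) / (mean(outside) + pseudocount))
        results.append((gene, log2fc, welch_t(inside, outside)))
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


# Hypothetical toy data: three genes measured in six cells, two clusters.
expr = {
    "geneA": [9.0, 11.0, 10.0, 1.0, 2.0, 0.0],  # high in cluster 0
    "geneB": [5.0, 6.0, 4.0, 5.0, 4.0, 6.0],    # same mean in both clusters
    "geneC": [0.0, 1.0, 2.0, 8.0, 7.0, 9.0],    # high in cluster 1
}
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_marker_genes(expr, labels, cluster=0)
for gene, log2fc, t in ranking:
    print(f"{gene}: log2FC={log2fc:.2f}, t={t:.2f}")
```

In a real analysis the t statistics would be converted to p-values and corrected for multiple testing across all genes before any gene is called significantly differentiated; that step is omitted here to keep the sketch dependency-free.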
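Cite this article: Li, Y. Cell Classification and Differential Gene Screening Based on t-SNE and UMAP Dimension Reduction [J]. Advances in Applied Mathematics, 2022, 11(10): 6951-6958. https://doi.org/10.12677/AAM.2022.1110737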
