Dimensionality Reduction and Clustering Analysis of the Global Innovation Index Based on Manifold Learning
DOI: 10.12677/sa.2026.153051. Supported by research project funding.
Author: Feixue He (何飞雪), School of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing
Keywords: Global Innovation Index; Manifold Learning; Entropy Weight Method; UMAP; Cluster Analysis; TOPSIS
Abstract: The Global Innovation Index (GII) is a high-dimensional, multi-indicator system for measuring national innovation capability, and its complexity makes in-depth analysis and intuitive visualization difficult. Traditional linear dimensionality reduction methods are limited in handling its nonlinear data structure, while existing manifold learning methods do not fully account for differences in feature importance. This study therefore proposes an improved UMAP method that incorporates feature weights, aiming to reveal the intrinsic structure and cluster characteristics of the global innovation landscape more effectively. Using GII data for 118 economies from 2013 to 2022, feature weights are first computed with the entropy weight method and incorporated into UMAP's distance metric to build a weighted dimensionality reduction model. K-Means clustering is then applied, multiple evaluation metrics are used to quantify clustering quality, and the TOPSIS method is used for comprehensive evaluation and ranking. The experimental results show that entropy-weighted UMAP achieves the best overall performance when the number of clusters is 5, ranking first under TOPSIS, and that it identifies structure better than the PCA dimensionality reduction method it is compared against. The approach provides a more robust data processing tool for analyzing the global innovation landscape and a methodological reference for dimensionality reduction and clustering in similar multi-indicator comprehensive evaluation systems.
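Cite this article: He, F.X. (2026) Dimensionality Reduction and Clustering Analysis of the Global Innovation Index Based on Manifold Learning. Statistics and Application, 15(3), 12-25. https://doi.org/10.12677/sa.2026.153051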
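
The abstract names three computational steps that can be illustrated compactly: entropy weighting, a weighted distance for dimensionality reduction, and TOPSIS ranking. The following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes min-max normalization before the entropy calculation, assumes the weights enter the metric as a weighted Euclidean distance (equivalent to scaling each feature by the square root of its weight), and uses synthetic toy numbers throughout. The helper names `minmax`, `entropy_weights`, and `topsis` are hypothetical, and the three clustering criteria in the decision matrix are placeholders, since the abstract does not name the evaluation metrics.

```python
import numpy as np


def minmax(X):
    """Scale each column to [0, 1]; constant columns become all zeros."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    return (X - X.min(axis=0)) / np.where(span == 0, 1.0, span)


def entropy_weights(Z):
    """Entropy weights for a non-negative (n_samples, n_features) matrix."""
    n = Z.shape[0]
    col_sum = Z.sum(axis=0)
    # Column-wise proportions; an all-zero (constant) column is treated as
    # uniform, so it carries no information and receives (near) zero weight.
    P = np.where(col_sum > 0, Z / np.where(col_sum > 0, col_sum, 1.0), 1.0 / n)
    logP = np.log(P, out=np.zeros_like(P), where=P > 0)  # 0 * log(0) := 0
    entropy = -(P * logP).sum(axis=0) / np.log(n)
    diversity = np.clip(1.0 - entropy, 0.0, None)
    return diversity / diversity.sum()


def topsis(D, weights, benefit):
    """Relative closeness to the ideal solution for each row of D."""
    D = np.asarray(D, dtype=float)
    V = D / np.linalg.norm(D, axis=0) * weights  # vector-normalize, then weight
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    return d_neg / (d_pos + d_neg)


# Toy data only: 30 synthetic "economies" x 4 indicators, the last one constant.
rng = np.random.default_rng(0)
X = rng.random((30, 4))
X[:, 3] = 0.5
Z = minmax(X)
w = entropy_weights(Z)

# Assumption: weighted Euclidean distance, which equals the plain Euclidean
# distance after scaling each column by sqrt(weight).
Xw = Z * np.sqrt(w)

# Toy decision matrix: rows are candidate configurations, columns are three
# hypothetical clustering criteria (two benefit-type, one cost-type).
D = np.array([[0.6, 900.0, 0.5],
              [0.4, 500.0, 0.9],
              [0.5, 700.0, 0.7]])
closeness = topsis(D, weights=np.full(3, 1 / 3),
                   benefit=np.array([True, True, False]))
ranking = np.argsort(-closeness)  # best configuration first
```

The scaled matrix `Xw` could then be passed to any Euclidean-based reducer, for example `umap.UMAP` from the umap-learn package, followed by K-Means on the embedding. Both steps are left out here so the sketch depends only on NumPy.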
