Improved Density Peak Clustering Algorithm Based on Sparrow Search Algorithm
Abstract: The density peaks clustering (DPC) algorithm has two weaknesses: its conventional distance measure does not reflect the data distribution well, and its cutoff distance parameter is chosen manually, which makes the choice highly subjective. To address these problems, an improved density peaks clustering algorithm based on the sparrow search algorithm (SSA-DPC) is proposed. The algorithm improves DPC in two respects. First, it changes the distance measure between data points, replacing the Euclidean distance of the original algorithm with the standardized Euclidean distance. Second, it exploits the strong global optimization ability of the sparrow search algorithm (SSA) to search for the best cutoff distance. Simulation tests on 7 data sets show that SSA-DPC outperforms the other clustering algorithms compared on all 3 evaluation metrics, which improves clustering performance and demonstrates the effectiveness of the algorithm.
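The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `standardized_euclidean` and `dpc_quantities` are hypothetical, the cutoff-kernel density follows the usual DPC definition, and the cutoff distance `dc` is simply passed in as a parameter rather than searched for with SSA.

```python
import numpy as np

def standardized_euclidean(X):
    """Pairwise standardized Euclidean distances.

    Each feature is divided by its standard deviation before the ordinary
    Euclidean distance is taken, so no single feature dominates by scale.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # assumption: constant features are left unscaled
    Z = X / std
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def dpc_quantities(D, dc):
    """Return DPC local density rho and distance delta for a distance matrix D.

    rho[i]   = number of other points closer than the cutoff distance dc.
    delta[i] = distance to the nearest point with strictly higher density;
               a point with no denser neighbour gets its largest distance.
    """
    n = D.shape[0]
    rho = (D < dc).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    delta = np.empty(n)
    for i in range(n):
        denser = rho > rho[i]
        delta[i] = D[i, denser].min() if denser.any() else D[i].max()
    return rho, delta
```

Points with both a large `rho` and a large `delta` are the candidate cluster centres in DPC. In this sketch a value such as `dc=1.5` would be supplied by hand, which is exactly the subjective choice that the abstract proposes to replace with an SSA search.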
Cite this article: 何婷霭, 李秦. Improved Density Peak Clustering Algorithm Based on Sparrow Search Algorithm [J]. 理论数学 (Pure Mathematics), 2022, 12(10): 1669-1678. https://doi.org/10.12677/PM.2022.1210181
