Research on K-Means Algorithm Analysis and Improvement

doi:10.12677/CSA.2016.69069


Authors: 藏传宇, 沈勇, 张宇昊, 陈长庚, 张浩, 杨真谛 (School of Software, Yunnan University, Kunming, Yunnan)
Keywords: Machine Learning; Cluster Analysis; K-Means Algorithm; p-K-Means Algorithm

Abstract: The traditional k-means algorithm initializes its cluster centers at random. The main advantage of this approach is that the initial centers are produced quickly. The main drawback is that several initial centers may fall within the same cluster, which leads to an excessive number of iterations and can even trap the algorithm in a local optimum that yields an incorrect clustering. To address this weakness, this paper proposes the p-K-means algorithm, which uses a geometric distance method to correct the uneven distribution of initial cluster centers, in particular the case where several centers appear in the same cluster. This helps k-means avoid local optima during clustering and also reduces the number of iterations required. Experiments comparing the two algorithms show that the improved algorithm converges faster than traditional k-means and is less likely to fall into a local optimum.
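The contrast drawn in the abstract can be illustrated with a small sketch. The abstract does not spell out the p-K-means procedure, so the code below is not the authors' method: as an assumption, it uses farthest-first (maximin) seeding as a stand-in for a distance-based initialization and compares it with random seeding on synthetic data. The function names (`random_init`, `farthest_first_init`, `kmeans`, `make_blobs`) and the test data are hypothetical and chosen only for illustration.

```python
import math
import random


def random_init(points, k, rng):
    """Traditional initialization: pick k distinct points uniformly at random."""
    return rng.sample(points, k)


def farthest_first_init(points, k, rng):
    """Distance-based seeding (a stand-in, NOT the paper's p-K-means).

    Start from one random point, then repeatedly add the point whose distance
    to its nearest already-chosen center is largest.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        next_center = max(
            points, key=lambda p: min(math.dist(p, c) for c in centers)
        )
        centers.append(next_center)
    return centers


def kmeans(points, centers, max_iter=100):
    """Lloyd's iterations. Returns (centers, labels, passes), where passes
    counts assignment passes, including the final one that confirms the
    labels no longer change."""
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            return centers, labels, iteration
        labels = new_labels
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[j])  # empty cluster: keep old center
        centers = new_centers
    return centers, labels, max_iter


def make_blobs(rng, blob_centers, n_per_blob=30, spread=0.5):
    """Synthetic 2-D data: Gaussian blobs around the given centers."""
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in blob_centers
        for _ in range(n_per_blob)
    ]


rng = random.Random(0)
data = make_blobs(rng, [(0, 0), (10, 0), (5, 9)])
for name, init in [("random", random_init), ("farthest-first", farthest_first_init)]:
    centers, labels, passes = kmeans(data, init(data, 3, rng))
    print(name, "passes:", passes)
```

On well-separated data such as this, farthest-first seeding places one seed in each blob, so the first assignment pass already recovers the blobs. Random seeding may place two seeds in the same blob, which is the failure mode the abstract describes.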

Cite this article: 藏传宇, 沈勇, 张宇昊, 陈长庚, 张浩, 杨真谛. Research on K-Means Algorithm Analysis and Improvement [J]. Computer Science and Application (计算机科学与应用), 2016, 6(9): 551-564. http://dx.doi.org/10.12677/CSA.2016.69069

