The Combining Technology of Data Mining Based on Clustering and Association Rules
DOI: 10.12677/ORF.2017.74018
Authors: Li Han (李涵), Zhang Dongsheng (张东生)*: School of Software, Henan University, Kaifeng, Henan
Keywords: Clustering; Association Rules; Data Mining; Machine Learning
Abstract: Clustering analysis and association rules are two of the main methods used for data mining, but they differ in three respects. Clustering operates on continuous data, whereas association rules operate on discrete data. Clustering serves the descriptive function of mining, whereas association rules serve the predictive/verification function. Clustering outputs clusters, whereas association rule mining outputs rules. The two methods are also complementary, so this paper combines them. The sample set is first clustered so that each sample acquires a class label. Association rule mining is then run on the samples carrying this class attribute, which reduces the dimensionality of the mining computation and gives it a better-defined target. The mining results can clearly expose latent knowledge such as why the clusters formed and how the clusters relate to one another. Experiments show that the combined mining technique described here achieves better mining results and has considerable practical value.
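The two-stage pipeline described in the abstract (cluster first, then mine rules over the labeled samples) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy samples, the attribute names, the thresholds, plain k-means standing in for whichever clustering method the authors used, and a brute-force enumeration standing in for an Apriori-style rule miner. The helper names `kmeans` and `rules_for_clusters` are hypothetical.

```python
from itertools import combinations

def kmeans(points, centers, iters=20):
    """Plain Lloyd's k-means; `centers` are caller-chosen starting centroids."""
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [min(range(len(centers)),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def rules_for_clusters(transactions, min_support=0.3, min_confidence=0.8, max_len=2):
    """Enumerate rules `antecedent -> cluster=k` that meet both thresholds."""
    n = len(transactions)

    def support(items):
        return sum(1 for t in transactions if items <= t) / n

    attrs = sorted({i for t in transactions for i in t if not i.startswith("cluster=")})
    clusters = sorted({i for t in transactions for i in t if i.startswith("cluster=")})
    rules = []
    for size in range(1, max_len + 1):
        for combo in combinations(attrs, size):
            ante = frozenset(combo)
            s_ante = support(ante)
            if s_ante < min_support:
                continue  # infrequent antecedent, skip
            for c in clusters:
                s_rule = support(ante | {c})
                if s_rule >= min_support and s_rule / s_ante >= min_confidence:
                    rules.append((tuple(sorted(ante)), c, s_rule, s_rule / s_ante))
    return rules

# Hypothetical toy data: (continuous features, discrete attributes) per sample.
samples = [
    ((92, 88), {"attendance=high", "homework=done"}),
    ((85, 90), {"attendance=high", "homework=done"}),
    ((88, 79), {"attendance=high", "homework=partial"}),
    ((55, 60), {"attendance=low", "homework=partial"}),
    ((48, 52), {"attendance=low", "homework=missing"}),
    ((60, 58), {"attendance=low", "homework=missing"}),
]

# Stage 1: cluster on the continuous features only.
points = [p for p, _ in samples]
labels = kmeans(points, centers=[points[0], points[-1]])

# Stage 2: attach the cluster label as one more discrete item, then mine rules
# whose consequent is a cluster label.
transactions = [attrs | {f"cluster={l}"} for (_, attrs), l in zip(samples, labels)]
rules = rules_for_clusters(transactions)

for ante, cons, sup, conf in rules:
    print(f"{' & '.join(ante)} -> {cons}  (support={sup:.2f}, confidence={conf:.2f})")
```

Restricting the consequent to cluster labels is one way to read the abstract's claim of a "better-defined target": the miner only has to explain cluster membership, so far fewer candidate rules are examined than in an unconstrained search.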
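Cite this article: Li Han, Zhang Dongsheng. The Combining Technology of Data Mining Based on Clustering and Association Rules [J]. Operations Research and Fuzziology (运筹与模糊学), 2017, 7(4): 170-176. https://doi.org/10.12677/ORF.2017.74018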
