Research on Information Loss of Attribute Reduction Based on Joint Granularity
Abstract:
With the rapid development of Internet technology, society has entered the era of big data. Data are not only diverse in type and complex in structure but also change dynamically. How to quickly extract valuable information from massive data is an urgent problem. Rough set theory is a method for evaluating data under uncertainty, and attribute reduction is one of its core applications. This paper studies the amount of information lost after attribute reduction, with the aim of finding a reduction algorithm that keeps classification accuracy high while losing little information. Building on the concept of knowledge granularity and existing reduction algorithms, the paper introduces joint granularity, applies it to the attribute reduction process, and derives an attribute reduction algorithm based on joint granularity. The algorithm is then used to reduce decision table systems, and the results indicate that it lowers information loss while leaving classification accuracy unchanged. Finally, simulation experiments on UCI data sets verify the accuracy and effectiveness of the method.
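The abstract does not define joint granularity, so the sketch below illustrates only the underlying notion it builds on. It assumes the commonly used definition of knowledge granularity, GK(B) = Σ|X_i|² / |U|², where the X_i are the equivalence classes that the attribute subset B induces on the universe U. The toy decision table, the attribute names `a`, `b`, `c`, `d`, and the function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def knowledge_granularity(rows, attrs):
    """Assumed definition: sum of squared class sizes divided by |U|^2."""
    n = len(rows)
    return Fraction(sum(len(b) ** 2 for b in partition(rows, attrs)), n * n)


# Hypothetical toy decision table: condition attributes a, b, c; decision d.
table = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]

full = knowledge_granularity(table, ["a", "b", "c"])
without_c = knowledge_granularity(table, ["a", "b"])
only_a = knowledge_granularity(table, ["a"])

# In this toy table, dropping c leaves the granularity unchanged,
# while keeping only a makes the partition coarser.
print(full, without_c, only_a)
```

In this toy table the attribute `c` duplicates `a`, so removing it does not coarsen the partition, whereas reducing to `a` alone does. A finer partition gives a smaller granularity value, so a growing value after removing attributes is one simple signal that information has been lost. The paper's own joint-granularity measure may differ from this.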