An Improved FAST Feature Selection Algorithm Considering Feature Interaction
Abstract: Interacting features are features that, considered individually, appear irrelevant or only weakly relevant to the class, but that become highly relevant to the class when considered together. Feature interaction is ubiquitous, yet discovering interacting features is a challenging task in feature selection. This paper improves the clustering-based FAST feature selection algorithm by taking feature interaction into account: the irrelevant-feature-removal stage of FAST is dropped, and an interaction weight factor is introduced, so that interacting features are retained while irrelevant and redundant features are removed. To compare the improved algorithm, IWFAST, with the original, we carried out an empirical study on 16 public data sets covering 5 different domains and evaluated the results with 4 classifiers: C5.0, Bayes Net, Neural Net, and Logistic. The two algorithms were then compared on three criteria: the number of selected features, the running time of the algorithm, and classifier accuracy. The experimental results show that the two algorithms select similar numbers of features, and that IWFAST sometimes yields even smaller feature subsets. Meanwhile, IWFAST improves classifier accuracy, especially on data sets with many features and on those from the Game and Life domains. Its one shortcoming is a longer running time, which nevertheless remains within an acceptable range.
Article citation: Lu, B.Y. and Zhang, L. (2017) An Improved FAST Feature Selection Algorithm Considering Feature Interaction. Hans Journal of Data Mining, 7(2), 51-63. https://doi.org/10.12677/HJDM.2017.72006
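The abstract describes IWFAST only at a high level, and no code is given here. As a rough illustration of the kind of quantity an interaction weight can be built on, the sketch below computes the three-way interaction gain IG(X; Y; C) = I(X,Y; C) - I(X; C) - I(Y; C) popularized by Jakulin and Bratko, which is positive exactly when two features are jointly more informative about the class than they are separately. The function names and the XOR example are our own illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the interaction-gain quantity an interaction weight
# can be based on (illustrative only; not the paper's exact formulation).

from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy H(X) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_info(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def interaction_gain(xs, ys, cs):
    """IG(X; Y; C) = I(X,Y; C) - I(X; C) - I(Y; C).

    A positive value indicates synergy: the two features together tell
    us more about the class than the sum of their separate contributions.
    """
    joint = list(zip(xs, ys))
    return (mutual_info(joint, cs)
            - mutual_info(xs, cs)
            - mutual_info(ys, cs))

# XOR-style example: each feature alone is uninformative about the class,
# but together they determine it exactly.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
c = [0, 1, 1, 0]                    # c = x XOR y
print(mutual_info(x, c))            # 0.0: x alone looks irrelevant
print(interaction_gain(x, y, c))    # 1.0: one full bit of synergy
```

In an XOR-style data set each feature alone has zero mutual information with the class, so a selector that scores features individually would discard both; a positive interaction gain is the signal that lets an interaction-aware method such as IWFAST retain them.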
