Study on Class-Specific Instance Weighted Naive Bayes
DOI: 10.12677/AAM.2023.1210423. Supported by the National Natural Science Foundation of China.
Authors: Jiaqi Zeng, Ping Peng*, Liu Yang, Guikai Hu: School of Science, East China University of Technology, Nanchang, Jiangxi
Keywords: Naive Bayes; Instance Weighting; Class-Specific; Classifier
Abstract: Many improvements to naive Bayes have been proposed to weaken the influence of its attribute conditional independence assumption, and instance weighting is an important direction among them. However, existing instance weighting schemes treat the training set as a whole and do not consider how instances are distributed within each class. This paper therefore proposes two class-specific instance weighted naive Bayes algorithms: correlation-based class-specific instance weighted naive Bayes (CCSIWNB) and class-specific attribute value frequency instance weighted naive Bayes (CSAVFWNB). In CCSIWNB, the weight of an instance is obtained by computing its similarity to the mode instance of its own class and then removing its average similarity to the mode instances of the other classes. In CSAVFWNB, the weight of an instance is the inner product of the within-class attribute value frequency vector and the vector of attribute value counts for that class. Finally, CCSIWNB and CSAVFWNB are compared experimentally with naive Bayes and other instance weighted naive Bayes algorithms on standard UCI data sets. The results show that the proposed algorithms achieve higher accuracy than the other algorithms.
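The two weighting ideas described in the abstract can be illustrated with a minimal Python sketch for categorical data. The abstract does not give the exact formulas, so several choices here are assumptions rather than the authors' definitions: the "mode instance" of a class is taken to be the per-attribute most frequent value within that class, similarity is taken to be the fraction of matching attribute values, and no rescaling is applied to the resulting weights. The function names (`class_modes`, `overlap`, `ccsiw_weights`, `csavf_weights`) are hypothetical.

```python
from collections import Counter


def class_modes(X, y):
    """Assumed 'mode instance' per class: most frequent value of each attribute."""
    modes = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        modes[c] = [Counter(col).most_common(1)[0][0] for col in zip(*rows)]
    return modes


def overlap(a, b):
    """Assumed similarity: fraction of attributes on which two instances agree."""
    return sum(u == v for u, v in zip(a, b)) / len(a)


def ccsiw_weights(X, y):
    """Similarity to the own-class mode minus average similarity to other-class modes."""
    modes = class_modes(X, y)
    weights = []
    for x, c in zip(X, y):
        own = overlap(x, modes[c])
        others = [overlap(x, m) for k, m in modes.items() if k != c]
        avg_other = sum(others) / len(others) if others else 0.0
        weights.append(own - avg_other)  # may be zero or negative without rescaling
    return weights


def csavf_weights(X, y):
    """Inner product of within-class value frequencies and within-class value counts."""
    per_class = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        counts = [Counter(col) for col in zip(*rows)]
        per_class[c] = (counts, len(rows))
    weights = []
    for x, c in zip(X, y):
        counts, n = per_class[c]
        # Relative frequency, within the class, of each attribute value of x.
        freq = [counts[j][v] / n for j, v in enumerate(x)]
        # Number of distinct values each attribute takes within the class.
        n_values = [len(counts[j]) for j in range(len(x))]
        weights.append(sum(f * k for f, k in zip(freq, n_values)))
    return weights
```

In either scheme, the resulting weights would then replace unit counts when estimating the prior and conditional probabilities of naive Bayes. How the paper normalizes the weights or handles non-positive values is not stated in the abstract, so that step is left out of this sketch.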
Cite this article: Zeng, J., Peng, P., Yang, L. and Hu, G. (2023) Study on Class-Specific Instance Weighted Naive Bayes. Advances in Applied Mathematics, 12(10), 4300-4309. https://doi.org/10.12677/AAM.2023.1210423
