Blansche, A., Gancarski, P. and Korczak, J.J. (2006) A Modular Approach for Clustering with Local Attribute Weighting. Pattern Recognition Letters, 27, 1299-1306. References | Hans Publishers



Cited by the following article:

Title: The Simply Implement of Effective Naive Bayes Web News Text Classification Model (高效朴素贝叶斯Web新闻文本分类模型的简易实现)

Authors: 吴致晖, 刘洪伟, 陈丽

Keywords: Text Classification; Feature Selection; Naive Bayes; TF-IDF Standard

Journal: Statistics and Application, Vol.3 No.1, 2014-03-28

Abstract: Naive Bayes, when used as a text classification algorithm, assumes that features occur independently of one another and that every feature is equally important, so choosing an effective feature selection method is especially important. This paper uses the TF-IDF standard [1] of the jieba Chinese word segmentation module to select features from the training news texts and implements a text classification model based on Naive Bayes. The same TF-IDF standard is used to extract keywords from each news text to be classified before the classification test. The experimental results show that the method achieves high classification efficiency.
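The pipeline the abstract describes (keyword extraction followed by Naive Bayes classification) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each document has already been reduced to a keyword list (for example by jieba's TF-IDF keyword extraction), and the function names, the add-one smoothing, and the toy English keywords are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Fit a multinomial Naive Bayes model from (keywords, label) pairs."""
    class_doc_counts = Counter()              # documents per class
    class_word_counts = defaultdict(Counter)  # keyword counts per class
    vocabulary = set()
    for keywords, label in labeled_docs:
        class_doc_counts[label] += 1
        class_word_counts[label].update(keywords)
        vocabulary.update(keywords)
    return class_doc_counts, class_word_counts, vocabulary


def classify(model, keywords):
    """Return the class with the highest log-posterior for the keywords."""
    class_doc_counts, class_word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    vocab_size = len(vocabulary)
    best_label, best_score = None, float("-inf")
    for label, doc_count in class_doc_counts.items():
        # Log prior: share of training documents in this class.
        score = math.log(doc_count / total_docs)
        total_words = sum(class_word_counts[label].values())
        for word in keywords:
            if word not in vocabulary:
                continue  # ignore keywords never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = class_word_counts[label][word]
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical keyword lists standing in for TF-IDF extraction output.
training_data = [
    (["match", "goal", "team"], "sports"),
    (["team", "coach", "league"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "rate", "market"], "finance"),
]
model = train_naive_bayes(training_data)
print(classify(model, ["team", "goal"]))
```

Summing log-probabilities instead of multiplying raw probabilities keeps the score numerically stable for long keyword lists, and the independence assumption mentioned in the abstract is what allows the per-keyword terms to be added separately.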
