Subword-Level Chinese Text Classification Method Based on BERT
DOI: 10.12677/CSA.2020.106112
Author: Li Sirui (李思锐)*: School of Computer Science, Chengdu University of Information Technology, Chengdu, Sichuan
Keywords: BERT Model, Subword Level, Text Classification, Masked Language Model
Abstract: The volume of text on the Internet is growing rapidly, and classifying that text is essential for extracting and processing it efficiently. Building on the BERT model, this paper proposes a subword-level Chinese text classification method. The method replaces the masking scheme of the original masked language model with subword-level masking, so that complete Chinese words are masked, which strengthens the BERT model's word vector representation of Chinese text. It also adds a Chinese word position embedding to compensate for the BERT model's lack of Chinese word position information. Experimental results show that a BERT model using this method achieves the best classification performance among the compared models on multiple Chinese datasets.
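The two ideas in the abstract, masking complete Chinese words and recording the position of the word each character belongs to, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not give the exact procedure, so the function names, the pre-segmented input, the 0.5 masking probability, and the plain "replace with [MASK]" rule (without BERT's random or unchanged token substitutions) are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"

def char_tokens_with_word_positions(words):
    """Split pre-segmented words into characters and record, for each
    character, the index of the word it came from (a hypothetical
    stand-in for a word position id)."""
    tokens, word_pos = [], []
    for w_idx, word in enumerate(words):
        for ch in word:
            tokens.append(ch)
            word_pos.append(w_idx)
    return tokens, word_pos

def mask_whole_words(tokens, word_pos, mask_prob=0.15, rng=None):
    """Pick words at random and replace every character of a picked word
    with [MASK], so that no word is left partially masked."""
    rng = rng or random.Random()
    n_words = max(word_pos) + 1 if word_pos else 0
    picked = {w for w in range(n_words) if rng.random() < mask_prob}
    masked = [MASK if w in picked else t for t, w in zip(tokens, word_pos)]
    # Prediction targets: the original character where masked, else None.
    labels = [t if w in picked else None for t, w in zip(tokens, word_pos)]
    return masked, labels

# Assumed output of some word segmenter (hypothetical example sentence).
words = ["文本", "分类", "必不可少"]
tokens, word_pos = char_tokens_with_word_positions(words)
masked, labels = mask_whole_words(tokens, word_pos, mask_prob=0.5,
                                  rng=random.Random(0))
print(tokens)
print(word_pos)
print(masked)
```

In a real model the word position ids would presumably index a learned embedding table that is added to the token and position embeddings; that step is omitted here because the abstract does not describe it.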
Cite this article: Li Sirui. Subword-Level Chinese Text Classification Method Based on BERT [J]. Computer Science and Application, 2020, 10(6): 1075-1086. https://doi.org/10.12677/CSA.2020.106112
