Character Level Full Convolution Neural Network for Text Classification
DOI: 10.12677/CSA.2020.102024
Author: Li Sirui (李思锐)*, School of Computer Science, Chengdu University of Information Technology, Chengdu, Sichuan
Keywords: Text Classification; Character Level; Fully Convolutional Network; Global Average Pooling
Abstract: The fully connected layers of a traditional convolutional neural network carry a large number of parameters and are computationally inefficient. To address this, the paper applies the fully convolutional network and the global average pooling layer, both used in image processing, to text classification: a convolutional layer combined with a global average pooling layer replaces the fully connected layers. Following the Inception structure, the model also uses multi-scale convolution kernels. Together these changes reduce the number of parameters, speed up convergence, and improve classification accuracy. To avoid the curse of dimensionality and the slow training of word-level vectors, text is represented at the character level. Batch normalization layers are used in place of Dropout layers to reduce overfitting. Evaluation on the test data set with multiple metrics supports the effectiveness of the model: compared with traditional models, the proposed model achieves better classification performance.
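The parameter saving claimed in the abstract can be illustrated with a small back-of-the-envelope sketch. The layer sizes below (sequence length 34, 256 channels, a 1024-unit hidden layer, 4 classes) and the function names are hypothetical values chosen for illustration; they are not taken from the paper. The sketch compares a flatten-plus-dense classification head with a convolution that emits one channel per class followed by global average pooling, which itself has no trainable parameters.

```python
# Illustrative sketch only: all sizes and names here are assumptions,
# not the configuration used in the paper.

def fc_head_params(seq_len, channels, hidden, num_classes):
    """Weights and biases of flatten -> dense(hidden) -> dense(num_classes)."""
    flat = seq_len * channels
    return (flat * hidden + hidden) + (hidden * num_classes + num_classes)


def conv_gap_head_params(channels, num_classes, kernel_size=1):
    """Weights and biases of a 1-D convolution with num_classes filters.

    The global average pooling that follows adds no parameters.
    """
    return kernel_size * channels * num_classes + num_classes


def global_average_pool(feature_map):
    """Average a [seq_len][num_classes] feature map over positions,
    giving one score per class."""
    seq_len = len(feature_map)
    return [sum(column) / seq_len for column in zip(*feature_map)]


print(fc_head_params(seq_len=34, channels=256, hidden=1024, num_classes=4))  # 8918020
print(conv_gap_head_params(channels=256, num_classes=4))                     # 1028
print(global_average_pool([[1.0, 0.0], [3.0, 2.0]]))                         # [2.0, 1.0]
```

Under these assumed sizes the dense head needs several million parameters while the convolution-plus-pooling head needs about a thousand, and the latter no longer depends on the sequence length.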
Cite this article: Li, S. (2020) Character Level Full Convolution Neural Network for Text Classification. Computer Science and Application, 10(2), 225-235. https://doi.org/10.12677/CSA.2020.102024
