A Deep Learning-Based Text Sentiment Analysis Method for Web Comments
DOI: 10.12677/mos.2024.135487. Supported by the National Natural Science Foundation of China.
Authors: Li Dayi, Wang Youguo*, Zhai Qiqing: School of Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu
Keywords: Text Sentiment Analysis, RoBERTa-wwm-ext, Bi-LSTM, Multi-Head Attention
Abstract: Text sentiment analysis is one of the most actively studied topics in natural language processing. In sentiment analysis tasks, weak semantic representation of text vectors and insufficient feature extraction lead to inaccurate classification. To address this, the study proposes RoBERTa-BiLSTM-Multi-Head-Attention (RBM), a deep learning text classification framework that combines the RoBERTa-wwm-ext model with a multi-head attention mechanism. The model first uses the pre-trained RoBERTa-wwm-ext language model to capture the dynamic features of the text. A bidirectional long short-term memory network (Bi-LSTM) then extracts deeper semantic relationships, and its final time-step output is passed as the feature vector to the multi-head attention layer. Finally, a fully connected layer produces the classification result. In comparative experiments against a series of models, the proposed RBM model achieved higher accuracy, precision, recall, and F1 score on the ChnSentiCorp dataset of online reviews. The model extracts character- and word-level features well and improves sentiment analysis of Chinese review text.
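The pipeline described in the abstract (encoder, Bi-LSTM, multi-head attention, fully connected classifier) can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `RBMSketch` and all hyperparameters are hypothetical, a plain `nn.Embedding` stands in for the pre-trained RoBERTa-wwm-ext encoder so the sketch runs offline, and the final Bi-LSTM time step is used as the attention query over the full Bi-LSTM output sequence, which is only one possible reading of the abstract. Padding is not handled; all sequences in a batch are assumed to have the same length.

```python
import torch
import torch.nn as nn


class RBMSketch(nn.Module):
    """Toy encoder -> Bi-LSTM -> multi-head attention -> linear classifier.

    Hypothetical sketch: layer sizes are illustrative, not from the paper.
    """

    def __init__(self, vocab_size=1000, embed_dim=64, lstm_hidden=32,
                 num_heads=4, num_classes=2):
        super().__init__()
        # Assumption: an embedding table replaces the pre-trained
        # RoBERTa-wwm-ext encoder so the example needs no downloads.
        self.encoder = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, lstm_hidden,
                              batch_first=True, bidirectional=True)
        # Bi-LSTM outputs have size 2 * lstm_hidden (forward + backward),
        # which must be divisible by num_heads.
        self.attention = nn.MultiheadAttention(2 * lstm_hidden, num_heads,
                                               batch_first=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids):
        # input_ids: (batch, seq_len) integer token ids
        embedded = self.encoder(input_ids)      # (batch, seq_len, embed_dim)
        outputs, _ = self.bilstm(embedded)      # (batch, seq_len, 2*hidden)
        # Assumption: the final time-step output acts as the query and
        # attends over every Bi-LSTM output position.
        query = outputs[:, -1:, :]              # (batch, 1, 2*hidden)
        attended, _ = self.attention(query, outputs, outputs)
        # Fully connected layer maps the attended vector to class logits.
        return self.classifier(attended.squeeze(1))


if __name__ == "__main__":
    model = RBMSketch()
    fake_batch = torch.randint(0, 1000, (8, 20))  # 8 reviews, 20 tokens each
    print(model(fake_batch).shape)                # torch.Size([8, 2])
```

For a binary sentiment task such as ChnSentiCorp, the two logits per review would typically be trained with cross-entropy loss; swapping the embedding table for a real pre-trained encoder changes only the first layer and the input dimension of the Bi-LSTM.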
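Citation: Li Dayi, Wang Youguo, Zhai Qiqing. A Deep Learning-Based Text Sentiment Analysis Method for Web Comments [J]. Modeling and Simulation (建模与仿真), 2024, 13(5): 5372-5381. https://doi.org/10.12677/mos.2024.135487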
