A Bidirectional GRU and Attention-Based Improved Seq2Seq Model for Automatic Text Summarization
DOI: 10.12677/csa.2025.1511307. Supported by research project funding.
Authors: Xu Meiling, Qiu Tianyu: School of Information and Artificial Intelligence, Hebei Finance University, Baoding, Hebei
Keywords: Automatic Text Summarization, Seq2Seq Model, Attention Mechanism, Bidirectional GRU (BiGRU)
Abstract: As data volumes grow exponentially in the big data era, information overload has become increasingly prominent. Automatic text summarization uses computational models to distill the main ideas of a text into a concise summary, which helps alleviate information overload, and it is widely applied to news headline generation, text retrieval, and intelligent question answering. Building on an analysis of existing summarization techniques, this paper takes the Seq2Seq model as its core, examines the role of the attention mechanism in summary generation, and proposes an improved Seq2Seq model that combines a bidirectional GRU (BiGRU) with attention. The encoder uses a BiGRU to capture contextual semantic information from both directions, and the decoder incorporates an attention mechanism to improve the accuracy and coherence of the generated summaries. The model is trained and tested on the English CNN/Daily Mail dataset and evaluated with ROUGE metrics. Experimental results show that the proposed model outperforms the traditional Seq2Seq model in both summary quality and information coverage, supporting its effectiveness and feasibility.
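The abstract names ROUGE as the evaluation metric but does not state which variants or tooling were used. As a rough illustration only, the sketch below computes ROUGE-N recall, precision, and F1 from clipped n-gram overlap between a candidate summary and a single reference. The function names `ngrams` and `rouge_n`, the whitespace tokenization, the lowercasing, and the example sentences are all assumptions made here for illustration; they are not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall, precision and F1 for one candidate/reference pair.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: an n-gram counts at most as often as it appears in both texts.
    overlap = sum((cand & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}


# Hypothetical example sentences, not data from the paper.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
scores = rouge_n(candidate, reference, n=1)
```

Here five of the six reference unigrams are matched, so ROUGE-1 recall is 5/6. Published results are normally produced with an established ROUGE package rather than a hand-written function, and scores can differ with tokenization and stemming choices.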
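Citation: Xu Meiling, Qiu Tianyu. A Bidirectional GRU and Attention-Based Improved Seq2Seq Model for Automatic Text Summarization [J]. Computer Science and Application, 2025, 15(11): 320-330. https://doi.org/10.12677/csa.2025.1511307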