Research and Design of a Question-Answering Robot System Based on the RocketQA and ChatGLM Models
Abstract: With the rapid development of information technology, users increasingly need to obtain accurate information efficiently. To also support the software ecosystem of domestic operating systems, this paper designs and implements an intelligent document question-answering system on the Deepin operating system that integrates the RocketQA and ChatGLM models. The system first vectorizes the question with the RocketQA model and uses Faiss to retrieve and rank candidate documents. The RocketQA model then performs a second round of search and ranking to refine the results. Finally, the ChatGLM model converts the retrieved documents into a natural-language answer, which is presented to the user in the format “reference link + answer”. Extensive tests on the deepin wiki dataset show that the system performs well in both question-answering accuracy and response speed.
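The retrieve, re-rank, and generate flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the corpus, the example.org links, and every function name are hypothetical, and simple standard-library stand-ins replace the RocketQA encoder and re-ranker, the Faiss index, and the ChatGLM generator so that the sketch runs without those dependencies.

```python
import math
from collections import Counter

# Hypothetical mini-corpus of (reference link, passage). Not taken from the paper.
DOCS = [
    ("https://example.org/wiki/install", "install deepin from a usb drive"),
    ("https://example.org/wiki/update", "update system packages with apt"),
    ("https://example.org/wiki/desktop", "change the desktop wallpaper in settings"),
]


def embed(text):
    """Stand-in for the RocketQA encoder: an L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def dot(a, b):
    """Inner product of two sparse vectors stored as dicts."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def retrieve(question, k=2):
    """Stand-in for the Faiss search: brute-force inner product, keep the top k."""
    q = embed(question)
    scored = [(dot(q, embed(text)), i) for i, (_, text) in enumerate(DOCS)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by index
    return [i for _, i in scored[:k]]


def rerank(question, candidates):
    """Stand-in for the RocketQA re-ranker: share of question tokens in the passage."""
    q_tokens = set(question.lower().split())

    def score(i):
        return len(q_tokens & set(DOCS[i][1].split())) / max(len(q_tokens), 1)

    return sorted(candidates, key=lambda i: (-score(i), i))


def generate(question, passage):
    """Stand-in for ChatGLM: a real system would prompt the model with both inputs."""
    return f"Based on the documentation: {passage}."


def answer(question):
    """Run retrieval, re-ranking, and generation; format as reference link + answer."""
    best = rerank(question, retrieve(question))[0]
    link, passage = DOCS[best]
    return f"Reference link: {link}\nAnswer: {generate(question, passage)}"


print(answer("how to install deepin"))
```

Keeping the three stages as separate functions mirrors the structure in the abstract: a cheap first-pass search narrows the candidates, a more careful scorer reorders them, and only the top passage is handed to the generator together with its link.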
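Cite this article: Song Liping, Wang Tianyu, Song Lihua, Sun Hongfei. Research and Design of a Question-Answering Robot System Based on the RocketQA and ChatGLM Models [J]. Computer Science and Application, 2024, 14(11): 21-27. https://doi.org/10.12677/csa.2024.1411212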
