OptiCacheRAG: Multi-Level Adaptive RAG with Knowledge Graph and Distributed Cache
DOI: 10.12677/airr.2025.146132
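Authors: Qin Jiang: Computer School, Beijing Information Science and Technology University, Beijing; Shi Shuicai, Wang Hongjun, Zhang Yahao: Computer School, Beijing Information Science and Technology University, Beijing; TRS Information Technology Co., Ltd., Beijing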
Keywords: Large Language Models (LLMs); Knowledge Graph; Retrieval-Augmented Generation (RAG); Adaptive Multi-Level Retrieval; Distributed Cache
Abstract: Large language models (LLMs) show strong generation and reasoning abilities in natural language processing, but hallucination and high computational cost severely limit their practical deployment. Retrieval-Augmented Generation (RAG) mitigates hallucination by introducing external knowledge, and GraphRAG, which integrates knowledge graphs, addresses the flat knowledge representation of traditional RAG by supporting entity-relationship modeling and multi-hop reasoning. However, GraphRAG invokes the LLM frequently during both knowledge graph construction and question answering, which sharply increases computation and response latency and makes it hard to form a positive "performance-cost" feedback loop. To address this, we propose OptiCacheRAG, which integrates a knowledge graph-driven text indexing paradigm, an adaptive multi-level retrieval framework, and a Redis-based distributed cache. The model first grades each user question, applying lightweight retrieval to low-level questions and a deep retrieval strategy to global questions, so that retrieval resources are matched dynamically to user needs. It then maintains the distributed cache with the LRU algorithm, reusing entity knowledge from related questions to reduce repeated retrieval overhead. Experiments evaluate response speed, retrieval accuracy, and QA quality; the results show that OptiCacheRAG preserves knowledge-association modeling capability while significantly reducing computational cost and response time, achieving a joint improvement in performance and efficiency over baseline methods. The core contributions are: (1) an adaptive multi-level retrieval structure for precise "demand-resource" matching; (2) a Redis-based distributed cache for reusing related knowledge; (3) empirical validation of the model's combined advantages in efficiency and accuracy.
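The two mechanisms named in the abstract, question grading and an LRU-maintained cache of entity knowledge, can be illustrated with a minimal sketch. This is not the authors' implementation: the in-process `LRUCache` stands in for the Redis cache, and `classify`, `answer_context`, the keyword cues, and the retriever callables are all hypothetical names introduced only for illustration.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Optional


class LRUCache:
    """In-process stand-in for a Redis cache (assumption); evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: "OrderedDict[str, List[str]]" = OrderedDict()

    def get(self, key: str) -> Optional[List[str]]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: List[str]) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry


def classify(question: str) -> str:
    """Hypothetical keyword rule; the paper's actual grading mechanism is not given here."""
    global_cues = ("overall", "summarize", "compare", "trend")
    return "global" if any(cue in question.lower() for cue in global_cues) else "local"


def answer_context(
    question: str,
    entities: List[str],
    cache: LRUCache,
    retrievers: Dict[str, Callable[[str], List[str]]],
) -> List[str]:
    """Collect context per entity, reusing cached knowledge before retrieving again."""
    level = classify(question)
    context: List[str] = []
    for entity in entities:
        key = f"{level}:{entity}"
        facts = cache.get(key)
        if facts is None:  # cache miss: run the retriever matched to the question level
            facts = retrievers[level](entity)
            cache.put(key, facts)
        context.extend(facts)
    return context


if __name__ == "__main__":
    demo_retrievers = {
        "local": lambda e: [f"{e}: direct fact"],                            # lightweight path
        "global": lambda e: [f"{e}: direct fact", f"{e}: related summary"],  # deep path
    }
    demo_cache = LRUCache(capacity=2)
    print(answer_context("Who founded TRS?", ["TRS"], demo_cache, demo_retrievers))
    print(answer_context("Summarize the overall themes", ["TRS"], demo_cache, demo_retrievers))
```

Keying the cache by both question level and entity keeps lightweight and deep results separate, so a related follow-up question at the same level reuses the stored entity knowledge instead of triggering another retrieval.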
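Cite this article: Qin Jiang, Shi Shuicai, Wang Hongjun, Zhang Yahao. OptiCacheRAG: Multi-Level Adaptive RAG with Knowledge Graph and Distributed Cache [J]. Artificial Intelligence and Robotics Research, 2025, 14(6): 1410-1423. https://doi.org/10.12677/airr.2025.146132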
