Construction of Alzheimer’s Disease Knowledge Graph Based on LoRA Fine-Tuning
DOI: 10.12677/csa.2025.1512325
Authors: Jin Han, Li Li: School of Electronic Engineering, Tianjin University of Technology and Education, Tianjin
Keywords: Large Language Model, LoRA Fine-Tuning, Alzheimer’s Disease, Neo4j
Abstract: Alzheimer’s disease is an irreversible neurodegenerative disorder, and using knowledge graphs to support accurate auxiliary diagnosis of patients with Alzheimer’s disease and mild cognitive impairment is of considerable value. Traditional knowledge graph construction, however, usually relies on extensive manual annotation, which is costly and adapts poorly to new domains. The rapid development of artificial intelligence in recent years, and of large language models in particular, offers new technical support for this task. This paper proposes a method for constructing an Alzheimer’s disease knowledge graph based on large language models and LoRA fine-tuning, intended as a reference for building knowledge graphs efficiently in low-resource, low-cost settings. The method designs prompt templates for information extraction, builds an instruction dataset, applies few-shot LoRA fine-tuning to five large language models covering Chinese and English corpora, and compares their performance on joint entity-relation extraction. Experimental results show that Llama-3.1-Tulu-3-8B performs best on joint entity-relation extraction, reaching a precision of 82.5% at 60 training epochs. The method is then used to extract Alzheimer’s disease knowledge automatically from the relevant literature and to construct and visually analyze the resulting knowledge graph.
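The two steps the abstract names, building an instruction sample from a prompt template and scoring extracted triples by precision, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the template wording, the record fields (`instruction`, `output`), the function names, and the example triples are hypothetical and are not taken from the paper or its dataset.

```python
import json

# Hypothetical prompt template; the paper's actual wording is not given in the abstract.
PROMPT_TEMPLATE = (
    "Extract all (head, relation, tail) triples about Alzheimer's disease "
    "from the text below and return them as a JSON list.\n\nText: {text}"
)


def make_instruction_record(text, triples):
    """Build one instruction-tuning sample as an instruction/output pair."""
    return {
        "instruction": PROMPT_TEMPLATE.format(text=text),
        "output": json.dumps([list(t) for t in triples], ensure_ascii=False),
    }


def triple_precision(predicted, gold):
    """Precision = correctly predicted triples / all predicted triples."""
    predicted_set = {tuple(t) for t in predicted}
    gold_set = {tuple(t) for t in gold}
    if not predicted_set:
        return 0.0
    return len(predicted_set & gold_set) / len(predicted_set)


# Illustrative data only; not taken from the paper's dataset.
gold = [
    ("donepezil", "treats", "Alzheimer's disease"),
    ("amyloid-beta", "associated_with", "Alzheimer's disease"),
]
predicted = [
    ("donepezil", "treats", "Alzheimer's disease"),
    ("donepezil", "causes", "Alzheimer's disease"),
]

record = make_instruction_record(
    "Donepezil is used to treat Alzheimer's disease.", gold[:1]
)
precision = triple_precision(predicted, gold)
```

Here a triple counts as correct only when head, relation, and tail all match exactly, which is one common convention for joint entity-relation extraction; the paper may score matches differently.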
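Cite this article: Jin, H. and Li, L. (2025) Construction of Alzheimer’s Disease Knowledge Graph Based on LoRA Fine-Tuning. Computer Science and Application, 15(12), 100-111. https://doi.org/10.12677/csa.2025.1512325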
