Deep Learning Annotation Methods for the Corpora of State-Target Differentiation and Treatment
DOI: 10.12677/csa.2025.157185. Supported by research project funding.
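Authors: Jiang Xinran (蒋欣然): School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, Ningxia; School of Traditional Chinese Medicine, Ningxia Medical University, Yinchuan, Ningxia. Lian Shixin (连世新), Dong Fujiang (董富江), Zhang Wenxue* (张文学): School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, Ningxia. Ma Qiguo (马启国): School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, Ningxia; People's Government of Baitugang Township, Lingwu City, Yinchuan, Ningxia Hui Autonomous Region. Ma Jing (马静): School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, Ningxia; Meihua Architectural Design Co., Ltd., Ningxia Branch, Yinchuan, Ningxia.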
Keywords: State-Target Differentiation and Treatment; Corpus Annotation; Deep Learning; Named Entity Recognition; Entity Relationship Extraction
Abstract: Annotating Traditional Chinese Medicine (TCM) corpora for the theory of “State-Target Differentiation and Treatment” (STDT) is difficult because of terminological complexity, flexible expression, and dependence on domain knowledge. To address these challenges, this study proposes a two-stage annotation framework based on deep learning. First, a BERT + BiLSTM + CRF model performs named entity recognition, extracting six core entity types, including diseases, syndromes, states, and target herbs (F1-score of 72.06%). Second, an Attention + BiLSTM model performs entity relationship extraction, capturing the diagnosis and treatment logic that links disease, syndrome, and target herb (F1-score of 98.23%). Experiments show that the method significantly improves annotation efficiency and consistency, providing a reliable technical foundation for constructing STDT knowledge graphs and developing intelligent diagnosis and treatment systems.
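The abstract reports F1-scores for both stages but does not say how they are computed. As a minimal sketch of one common convention for the named entity recognition stage, the Python below assumes BIO-style tags and exact-match span scoring. Both are assumptions, not details confirmed by the paper, and the tag names `DISEASE` and `HERB` and the function names are hypothetical.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        # A span continues only on an I- tag of the same entity type.
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]],
              pred: List[List[str]]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1 over sentences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)      # a span counts only if type and boundaries match
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the predicted herb span is one token too short.
gold = [["B-DISEASE", "I-DISEASE", "O", "B-HERB", "I-HERB"]]
pred = [["B-DISEASE", "I-DISEASE", "O", "B-HERB", "O"]]
precision, recall, f1 = entity_f1(gold, pred)
```

Under exact-match scoring a span with a wrong boundary earns no credit, so in this example only the disease span counts as correct. Boundary errors of this kind are one plausible reason entity-level F1 on terminology-heavy text can sit well below the score of a downstream relation classifier that is evaluated on given entity pairs.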
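Cite this article: Jiang Xinran, Lian Shixin, Dong Fujiang, Ma Qiguo, Ma Jing, Zhang Wenxue. Deep Learning Annotation Methods for the Corpora of State-Target Differentiation and Treatment [J]. Computer Science and Application, 2025, 15(7): 100-113. https://doi.org/10.12677/csa.2025.157185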
