Research on Entity Relation Extraction of Chinese Medical Text Based on a Pointer Tagging Framework
DOI: 10.12677/CSA.2022.121018
Authors: Luo Wenlong (罗文龙), Wang Yong (王勇): School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong
Keywords: Entity Relation Extraction, Chinese Medical Text, Overlapping Triples, Word Information
Abstract: As science and technology in the medical field continue to develop, large volumes of medical text are being produced, and obtaining useful information from this mass of unstructured data has become a research hotspot in medicine and natural language processing. As a key step in information extraction, entity relation extraction identifies entity pairs and the semantic relations between them in natural-language sentences. Current entity relation extraction methods for Chinese medical text suffer from missing word information and overlapping relations. To address this, the paper proposes a Flat-Lattice pointer-tagging joint extraction model. Word information is encoded with relative positions to sharpen entity boundaries, and the pointer tagging framework treats each relation as a function that maps a subject to its objects, which resolves the relation overlap problem. Compared with several baseline models on a Chinese medical text dataset, the model achieves higher precision, recall and F1, demonstrating its effectiveness for entity relation extraction on Chinese medical text.
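The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy lexicon, the example sentence and the relation label `symptom` are all hypothetical, and the tag sequences are written by hand in place of model predictions. The first part flattens characters and lexicon-matched words into (head, tail) spans and computes the four head/tail offsets used for relative position encoding in flat-lattice models. The second part decodes start/end pointer tags, treating each relation as a mapping from one subject span to zero or more object spans.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # inclusive (head, tail) character indices


def lattice_spans(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[Span]:
    """Flatten every character plus every lexicon-matched word into spans."""
    spans = [(i, i) for i in range(len(sentence))]  # single characters
    for i in range(len(sentence)):
        for j in range(i + 1, min(i + max_len, len(sentence))):
            if sentence[i:j + 1] in lexicon:  # multi-character word match
                spans.append((i, j))
    return spans


def relative_distances(a: Span, b: Span) -> Tuple[int, int, int, int]:
    """Head-head, head-tail, tail-head and tail-tail offsets between two spans."""
    return (a[0] - b[0], a[0] - b[1], a[1] - b[0], a[1] - b[1])


def decode_spans(starts: List[int], ends: List[int]) -> List[Span]:
    """Pair each start tag (1) with the nearest end tag (1) at or after it."""
    spans = []
    for i, flag in enumerate(starts):
        if flag == 1:
            for j in range(i, len(ends)):
                if ends[j] == 1:
                    spans.append((i, j))
                    break
    return spans


def extract_triples(
    text: str,
    subject_tags: Tuple[List[int], List[int]],
    object_tags: Dict[Span, Dict[str, Tuple[List[int], List[int]]]],
) -> List[Tuple[str, str, str]]:
    """For each decoded subject, apply every relation as a subject -> objects map."""
    triples = []
    for subj in decode_spans(*subject_tags):
        for relation, (o_starts, o_ends) in object_tags.get(subj, {}).items():
            for obj in decode_spans(o_starts, o_ends):
                triples.append(
                    (text[subj[0]:subj[1] + 1], relation, text[obj[0]:obj[1] + 1])
                )
    return triples


def tags(n: int, positions: Set[int]) -> List[int]:
    """Build a 0/1 tag sequence of length n with 1 at the given positions."""
    return [1 if i in positions else 0 for i in range(n)]


# Hypothetical example: one subject shared by two triples of the same relation.
text = "肺炎常见症状为发热和咳嗽"
n = len(text)
spans = lattice_spans(text, {"肺炎", "症状", "发热", "咳嗽"})

subject_tags = (tags(n, {0}), tags(n, {1}))  # subject span (0, 1): 肺炎
object_tags = {(0, 1): {"symptom": (tags(n, {7, 10}), tags(n, {8, 11}))}}
triples = extract_triples(text, subject_tags, object_tags)
```

Because objects are tagged separately for each subject and each relation, a single entity can take part in several triples without any tag sequence having to carry more than one label per position, which is the property that lets this style of tagging represent overlapping relations.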
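Cite this article: Luo, W. and Wang, Y. (2022) Research on Entity Relation Extraction of Chinese Medical Text Based on Pointer Tagging Framework. Computer Science and Application, 12(1), 169-177. https://doi.org/10.12677/CSA.2022.121018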
