Research on Joint Extraction of Entity and Relation in Geological Hazard Text
DOI: 10.12677/CSA.2021.1111271. Supported by research project funding.
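Authors: Zhou Yuting, Chen Pinghua: School of Computers, Guangdong University of Technology, Guangzhou, Guangdong; Chen Jianping: School of Computer Science and Software, Zhaoqing University, Zhaoqing, Guangdong
Keywords: Named Entity Recognition; Relation Extraction; Joint Extraction; Labeling Mode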
Abstract: Traditional entity and relation extraction methods suffer from low efficiency, error propagation, and entity redundancy. To address these problems, this article proposes a joint extraction method for geological hazard entities and relations based on a bidirectional long short-term memory (BiLSTM) network and conditional random fields (CRF), combined with an attention mechanism. A new tagging scheme encodes both entity and relation information in the tags, which converts joint extraction into a sequence labeling task. The text is vectorized with character-level embeddings, and a BiLSTM-Attention-CRF model performs the joint extraction. Experimental results show that, on a geological hazard corpus, the method achieves an F-score of 85.4% for named entity recognition and 63.6% for relation extraction, demonstrating the superiority and effectiveness of the method.
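To make the "joint extraction as sequence labeling" idea concrete, the following is a minimal sketch of how a per-character tag sequence could be decoded back into relation triples. The abstract does not spell out the tagging scheme, so everything here is an assumption: the tag layout `position-relation-role` (position in `B/I/E/S`, role `1` for the head entity and `2` for the tail), the relation name `Cause`, the nearest-match pairing rule, and the function names `collect_spans` and `decode_triples` are all hypothetical illustrations, not the authors' method.

```python
from typing import List, Tuple

# (entity text, relation name, role, start index); role 1 = head, 2 = tail
Span = Tuple[str, str, int, int]


def collect_spans(chars: List[str], tags: List[str]) -> List[Span]:
    """Group characters into entity spans using hypothetical B/I/E/S tags."""
    spans: List[Span] = []
    buf: List[str] = []
    rel, role, start = "", "", -1
    for i, (ch, tag) in enumerate(zip(chars, tags)):
        if tag == "O":  # character outside any entity
            buf = []
            continue
        pos, rest = tag.split("-", 1)
        r, ro = rest.rsplit("-", 1)
        if pos in ("B", "S"):  # a new entity starts here
            buf, rel, role, start = [ch], r, ro, i
        elif buf and r == rel and ro == role:  # continue the open entity
            buf.append(ch)
        else:  # inconsistent tag: drop the open span
            buf = []
            continue
        if pos in ("E", "S"):  # entity ends here
            spans.append(("".join(buf), rel, int(role), start))
            buf = []
    return spans


def decode_triples(chars: List[str], tags: List[str]) -> List[Tuple[str, str, str]]:
    """Pair each head entity with the nearest tail entity of the same relation."""
    spans = collect_spans(chars, tags)
    triples = []
    for text, rel, role, start in spans:
        if role != 1:
            continue
        tails = [s for s in spans if s[1] == rel and s[2] == 2]
        if tails:
            tail = min(tails, key=lambda s: abs(s[3] - start))
            triples.append((text, rel, tail[0]))
    return triples


if __name__ == "__main__":
    # Hypothetical example: "rainstorm triggers landslide", one tag per character.
    chars = list("暴雨引发滑坡")
    tags = ["B-Cause-1", "E-Cause-1", "O", "O", "B-Cause-2", "E-Cause-2"]
    print(decode_triples(chars, tags))
```

Under a scheme of this kind, a single tagger such as a BiLSTM-Attention-CRF model only has to predict one tag per character; entities and relations are then recovered together in the decoding step, rather than by running a separate relation classifier over previously extracted entities.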
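Citation: Zhou, Y., Chen, P. and Chen, J. (2021) Research on Joint Extraction of Entity and Relation in Geological Hazard Text. Computer Science and Application, 11(11), 2672-2681. https://doi.org/10.12677/CSA.2021.1111271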
