Named Entity Recognition Based on Adversarial Generative Network
DOI: 10.12677/CSA.2020.102021. Supported by the National Natural Science Foundation of China.
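Authors: Jinye Cai, Chao Li, Zhao Qiu*: School of Computer Science and Cyberspace Security, Hainan University, Haikou, Hainan; Xiangsheng Huang: Institute of Automation, Chinese Academy of Sciences, Beijing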
Keywords: Information Retrieval; Generative Adversarial Network; Named Entity Recognition; Feature Extraction
Abstract: With the rapid development of Internet technology, extracting useful information from large volumes of text data has become a new challenge. Named entity recognition (NER) is an important task in information extraction and information retrieval; its goal is to identify the spans of text that denote named entities and to classify them. It is widely used and plays an important role in many areas, such as spam filtering, public opinion analysis, and email classification. Generative adversarial networks can better learn the feature distribution of sample data, and variational autoencoders can better approximate real samples. Building on these advantages, this paper combines the two models, each advanced in its own way, within an adversarial framework, and designs a named entity recognition model based on a generative adversarial network to improve the accuracy and effectiveness of feature extraction.
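Cite this article: Cai, J., Li, C., Qiu, Z. and Huang, X. (2020) Named Entity Recognition Based on Adversarial Generative Network. Computer Science and Application, 10(2), 200-207. https://doi.org/10.12677/CSA.2020.102021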
