Chinese Named Entity Recognition Method Based on ALBERT
DOI: 10.12677/CSA.2020.105091. Supported by the National Natural Science Foundation of China.
Authors: Boyan Deng* (邓博研), Lianglun Cheng (程良伦): Guangdong University of Technology, Guangzhou, Guangdong
Keywords: Chinese Named Entity Recognition; ALBERT; Pre-Trained Language Model; BiLSTM; Conditional Random Field (CRF)
Abstract: The BERT pre-trained language model is widely used for Chinese named entity recognition because of its strong performance, but its large parameter count and long training time limit the settings in which it can be applied in practice. To address this problem, we propose ALBERT-BiLSTM-CRF, a Chinese named entity recognition model based on ALBERT. Structurally, the model first uses the ALBERT pre-trained language model, trained on large-scale text, to produce character-level embeddings. These embeddings are then fed into a BiLSTM to capture further dependencies between characters, and finally a CRF layer decodes the tag sequence and extracts the corresponding entities. The model combines the advantages of ALBERT and BiLSTM-CRF for recognizing Chinese entities and achieves an F1 score of 95.22% on the MSRA dataset. Experiments show that, while greatly reducing the number of pre-trained parameters, the model retains relatively good performance and scales well.
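The last stage of the pipeline described in the abstract, CRF decoding followed by entity extraction, can be illustrated with a minimal sketch. This is not the authors' implementation: the tag set, the emission scores (which stand in for BiLSTM outputs), the transition scores, and the function names `viterbi_decode` and `extract_entities` are all assumptions made for illustration. The example is chosen so that picking the best tag per character independently would produce an invalid sequence (`I-LOC` directly after `O`), while decoding with transition scores recovers a valid one.

```python
from typing import List, Sequence, Tuple

TAGS = ["O", "B-LOC", "I-LOC"]  # hypothetical BIO tag set
NEG = -1e4                      # large penalty for a forbidden transition

def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]],
                   tags: Sequence[str]) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t (BiLSTM output stand-in)
    transitions[i][j] : score of moving from tag i to tag j
    """
    n = len(tags)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            # Best previous tag for ending at tag j at position t.
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    best = max(range(n), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return [tags[i] for i in path]

def extract_entities(chars: Sequence[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Collect (text, type) spans from a BIO label sequence."""
    entities = []
    start, etype = None, None
    for idx, label in enumerate(labels + ["O"]):  # sentinel closes a trailing span
        inside = label.startswith("I-") and etype == label[2:]
        if start is not None and not inside:
            entities.append(("".join(chars[start:idx]), etype))
            start, etype = None, None
        if label.startswith("B-"):
            start, etype = idx, label[2:]
    return entities

# Made-up scores for the four characters of "我在北京" ("I am in Beijing").
CHARS = list("我在北京")
EMISSIONS = [
    [2.0, 0.1, 0.1],
    [2.0, 0.3, 0.1],
    [0.2, 1.0, 1.2],  # per-character argmax would pick I-LOC here
    [0.1, 0.4, 1.5],
]
TRANSITIONS = [
    [0.0, 0.0, NEG],  # from O: O -> I-LOC is forbidden
    [0.0, 0.0, 0.0],  # from B-LOC
    [0.0, 0.0, 0.0],  # from I-LOC
]

labels = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(labels)                           # ['O', 'O', 'B-LOC', 'I-LOC']
print(extract_entities(CHARS, labels))  # [('北京', 'LOC')]
```

In a trained model the transition scores are learned jointly with the rest of the network rather than set by hand, and start and end transitions are usually modeled as well; both are omitted here to keep the sketch short.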
Cite this article: Deng, B. and Cheng, L. (2020) Chinese Named Entity Recognition Method Based on ALBERT. Computer Science and Application, 10(5), 883-892. https://doi.org/10.12677/CSA.2020.105091
