Research on the Application of Text Box Recognition Algorithm Based on Convolutional Neural Network in Power Service System

doi:10.12677/AIRR.2023.123024


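Authors: Liu Cuimei (刘翠媚), Wu Yiliang (吴毅良)*, Guo Fengchan (郭凤婵), Luo Xuliang (罗序良), Lu Tinghui (陆庭辉): Jiangmen Power Supply Bureau, Guangdong Power Grid Co., Ltd., Jiangmen, Guangdong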
Keywords: Convolutional Neural Network; Text Box Recognition; Auxiliary Input; Information System; Artificial Intelligence

Abstract: To address the low efficiency of information entry at business-handling terminals in the power industry, this paper proposes a text box recognition algorithm based on a convolutional neural network (CNN). A Faster R-CNN network is trained and validated on a text box dataset and is combined with OCR technology to build an auxiliary input system. The CNN-based text box recognition algorithm is compatible with business terminal applications running on different systems and requires no change to the original system architecture, which improves its applicability. Experimental results show that, compared with manual entry, the auxiliary input system built on the CNN-based text box recognition algorithm significantly improves both the speed and the accuracy of information entry, and that it has broad application prospects for business-handling terminals in the power industry.
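The abstract does not say how detected text boxes are matched to the values read by OCR. As a rough illustration only, the sketch below assumes the detector returns pixel bounding boxes with confidence scores and that OCR values arrive in form order. The names `TextBox`, `reading_order`, and `plan_entries`, the 0.5 score threshold, and the 10-pixel row tolerance are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBox:
    """A detected text box: pixel corners plus the detector's confidence."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def reading_order(boxes, min_score=0.5, row_tol=10.0):
    """Keep confident boxes and sort them top-to-bottom, then left-to-right.

    A box joins the current row when its top edge lies within row_tol
    pixels of the top edge of the first box in that row.
    """
    kept = sorted((b for b in boxes if b.score >= min_score), key=lambda b: b.y1)
    rows = []
    for box in kept:
        if rows and abs(box.y1 - rows[-1][0].y1) <= row_tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [b for row in rows for b in sorted(row, key=lambda b: b.x1)]


def plan_entries(boxes, values):
    """Pair the center point of each ordered box with the value to enter."""
    ordered = reading_order(boxes)
    return [
        (((b.x1 + b.x2) / 2, (b.y1 + b.y2) / 2), v)
        for b, v in zip(ordered, values)
    ]


# Hypothetical detections for a form with two fields on the first row
# and one field on the second row.
detections = [
    TextBox(300, 102, 500, 130, 0.97),
    TextBox(40, 100, 240, 128, 0.99),
    TextBox(40, 160, 240, 188, 0.95),
    TextBox(10, 10, 30, 20, 0.20),  # below the score threshold, dropped
]
entries = plan_entries(detections, ["name", "id number", "address"])
```

Working from screen coordinates in this way is one possible reading of how such a system could avoid changes to the underlying application; the paper itself should be consulted for the actual design.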

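Cite this article: Liu Cuimei, Wu Yiliang, Guo Fengchan, Luo Xuliang, Lu Tinghui. Research on the Application of Text Box Recognition Algorithm Based on Convolutional Neural Network in Power Service System [J]. Artificial Intelligence and Robotics Research, 2023, 12(3): 209-218. https://doi.org/10.12677/AIRR.2023.123024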

