Casting Character Recognition Based on Deep Learning
DOI: 10.12677/ORF.2023.132141
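Author: Xiu Chang: Key Laboratory of Opto-Technology and Intelligent Control, Ministry of Education, Lanzhou Jiaotong University, Lanzhou, Gansu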
Keywords: Industrial Intelligence; Irregular Text; Curved Text; PGNet Network; Casting Character Recognition
Abstract: Manual recognition of casting characters is inefficient, and manual recording is error-prone. Existing character recognition methods cannot handle the complex casting characters found in industrial scenes, where extreme lighting, occlusion, and blur are also common. To address these problems, an improved PGNet network is proposed. The network recognizes curved and irregular text as well as horizontal text. Because casting character data are scarce, an STN rectification module is added for data augmentation; across different experiments, precision and recall improve by more than 1%. In addition, optimizing the loss function of the PGNet network reduces the misrecognition rate. These improvements to PGNet solve the above problems to a certain extent and make the casting character tracing and control process more accurate and efficient.
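The abstract reports its gain in terms of precision and recall. The evaluation protocol is not given on this page, so the following is only a minimal sketch of how the two metrics could be computed for end-to-end recognition, assuming string-level exact matching per image. The function name `precision_recall` and the sample casting codes are hypothetical, and real text-spotting benchmarks usually also require the detected region to overlap the ground-truth region.

```python
from collections import Counter


def precision_recall(predicted, ground_truth):
    """Exact-match precision and recall over the strings recognized in one image."""
    pred_counts = Counter(predicted)
    gt_counts = Counter(ground_truth)
    # Multiset intersection: a prediction counts as a true positive only if an
    # identical, not yet matched string exists in the ground truth.
    true_positives = sum((pred_counts & gt_counts).values())
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical casting codes: one misread string ("C90I2") and one missed string.
predicted = ["A1234", "B5678", "C90I2"]
ground_truth = ["A1234", "B5678", "C9012", "D3456"]
p, r = precision_recall(predicted, ground_truth)
print(f"precision={p:.3f}, recall={r:.3f}")
```

Under this assumption, a misread character lowers both metrics, while a character string that is never detected lowers only recall.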
Cite this article: Chang, X. (2023) Casting Character Recognition Based on Deep Learning. Operations Research and Fuzziology, 13(2), 1388-1400. https://doi.org/10.12677/ORF.2023.132141

References

[1] Du, Y., Li, C., Guo, R., et al. (2020) PP-OCR: A Practical Ultra Lightweight OCR System. arXiv:2009.09941.
[2] Neubeck, A. and Van Gool, L. (2006) Efficient Non-Maximum Suppression. 18th International Conference on Pattern Recognition (ICPR’06), 3, 850-855.
[3] Ding, J., Xue, N., Long, Y., et al. (2019) Learning RoI Transformer for Oriented Object Detection in Aerial Images. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, 15-20 June 2019, 2849-2858.
[4] Wang, P., Zhang, C., Qi, F., et al. (2021) PGNet: Real-Time Arbitrarily-Shaped Text Spotting with Point Gathering Network. Proceedings of the AAAI Conference on Artificial Intelligence, 35, 2782-2790.
[5] Cubuk, E.D., Zoph, B., Shlens, J., et al. (2020) RandAugment: Practical Automated Data Augmentation with a Reduced Search Space. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, 14-19 June 2020, 702-703.
[6] Long, J., Shelhamer, E. and Darrell, T. (2015) Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, 7-12 June 2015, 3431-3440.
[7] Watanabe, S., Hori, T., Kim, S., et al. (2017) Hybrid CTC/Attention Architecture for End-to-End Speech Recognition. IEEE Journal of Selected Topics in Signal Processing, 11, 1240-1253.
[8] Graves, A. (2012) Connectionist Temporal Classification. In: Supervised Sequence Labelling with Recurrent Neural Networks, Springer, Berlin, Heidelberg, Vol. 385, 61-93.
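[9] Xu, B.B., Cen, K.T., Huang, J.J., et al. (2020) A Survey on Graph Convolutional Neural Network. Chinese Journal of Computers, 43(5), 755-780. (In Chinese)
[10] Yan, X., Fan, X.L., Zheng, C.P., et al. (2020) Urban Traffic Situation Prediction Algorithm Based on Graph Convolutional Neural Network. Journal of Zhejiang University (Engineering Science), 54(6), 1147-1155. (In Chinese)
[11] Wang, J.X., Wang, Z.Y. and Tian, X. (2020) Review of Natural Scene Text Detection and Recognition Based on Deep Learning. Journal of Software, 31(5), 1465-1496. (In Chinese)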
[12] Jaderberg, M., Simonyan, K. and Zisserman, A. (2015) Spatial Transformer Networks. Advances in Neural Information Processing Systems, 28, 2017-2025.
[13] Wen, Y., Zhang, K., Li, Z., et al. (2016) A Discriminative Feature Learning Approach for Deep Face Recognition. 14th European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands, 11-14 October 2016, 499-515.