Research on Steel Embossed Character Recognition Based on the Deep Learning Algorithm YOLOv2
DOI: 10.12677/CSA.2020.101014. Supported by the National Natural Science Foundation of China.
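Authors: Huining Huang, Ju Huang, Chan Liang: School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi; Xuejun Zhang*: School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi; Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning, Guangxi; Yinghua Sun: Guangxi Dabai Xiaohei Intelligent Robot Co., Ltd. (广西大白小黑智能机器人有限公司), Nanning, Guangxi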
Keywords: Deep Learning; Character Recognition; YOLOv2; Target Detection; Image Processing
Abstract: Characters embossed on industrial steel parts have the same color as the surrounding background and are often unevenly lit, so traditional computer vision algorithms recognize them with poor efficiency and accuracy. This study proposes a steel embossed character recognition method based on YOLOv2. The embossed character dataset is expanded with basic image preprocessing operations, and the fast, reliable deep learning algorithm YOLOv2 extracts image features automatically to recognize the embossed characters (both digits and letters). Compared with traditional image recognition algorithms, the experimental results show that the network model recognizes steel embossed characters with an accuracy of 98.6% and an average processing time of 0.3 s, which meets the accuracy and efficiency requirements of engineering applications. In addition, the model output is refined using character position information, so that the correct production label is output directly. The method shows good stability and real-time performance in industrial production environments and has practical application value.
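The abstract does not say how the character position information is used, so the following is only a minimal sketch of one plausible post-processing step, not the authors' implementation. It assumes each detection carries a class label, a confidence score, and a box centre and size in pixels. The `Detection` type, the `read_label` function, the 0.5 confidence threshold, and the half-height row tolerance are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One detected character (hypothetical structure, not from the paper)."""
    label: str    # predicted character class, e.g. "A" or "7"
    score: float  # detector confidence in [0, 1]
    x: float      # box centre, horizontal (pixels)
    y: float      # box centre, vertical (pixels)
    w: float      # box width (pixels)
    h: float      # box height (pixels)


def read_label(dets: List[Detection], min_score: float = 0.5) -> str:
    """Assemble a label string by ordering detected characters by position.

    Low-confidence boxes are dropped. The rest are scanned top to bottom and
    grouped into rows: a box joins the current row when its centre lies within
    half the height of that row's first box. Each row is then read left to
    right, and rows are joined with a single space.
    """
    kept = sorted((d for d in dets if d.score >= min_score), key=lambda d: d.y)
    rows: List[List[Detection]] = []
    for d in kept:
        if rows and abs(d.y - rows[-1][0].y) <= 0.5 * rows[-1][0].h:
            rows[-1].append(d)   # same text row as the previous boxes
        else:
            rows.append([d])     # start a new text row
    return " ".join(
        "".join(c.label for c in sorted(row, key=lambda c: c.x)) for row in rows
    )
```

Sorting on box centres rather than on detection order matters because a detector returns boxes in no particular spatial order. The row tolerance would need tuning for tilted or curved stamps.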
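Citation: Huang, H.N., Zhang, X.J., Huang, J., Liang, C. and Sun, Y.H. (2020) Research on Steel Embossed Character Recognition Based on the Deep Learning Algorithm YOLOv2. Computer Science and Application, 10(1), 126-135. https://doi.org/10.12677/CSA.2020.101014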
