Text Detection Algorithm Based on Improved EAST
DOI: 10.12677/CSA.2021.111018. Supported by the National Natural Science Foundation of China.
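Authors: Wang Jun, Miao Jun*: Beijing Key Laboratory of Internet Culture and Digital Dissemination, Beijing Information Science and Technology University, Beijing; Qing Laiyun: School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing; Qiao Yuanhua: School of Mathematics and Physics, Beijing University of Technology, Beijing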
Keywords: Text Recognition, EAST, ASPP Network, BLSTM Neural Network
Abstract: Text localization and detection in natural scenes is an active research topic in text recognition. EAST is one of the stronger algorithms for this task, achieving high precision and recall on the ICDAR2015 dataset. However, its receptive field is not large enough, and it performs poorly on long text. This work therefore modifies the structure of EAST: an ASPP network is added to enlarge the receptive field, and a BLSTM neural network is added to strengthen the association between texts, improving localization. Experimental results show that on the ICDAR2015 text localization task the algorithm achieves a recall of 77.84%, a precision of 86.24%, and an F-score of 81.82%, outperforming the classical EAST algorithm.
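The reported F-score can be checked against the reported precision and recall. The sketch below assumes the F-score is the standard harmonic mean of precision and recall (F1); the abstract does not state the formula, and the function name f_score is only illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1).

    Assumption: the abstract's F-score is the standard F1 measure.
    Inputs and output share the same unit (here, percent).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for ICDAR2015, in percent.
reported_precision = 86.24
reported_recall = 77.84

f1 = f_score(reported_precision, reported_recall)
print(f"F-score: {f1:.2f}%")  # prints "F-score: 81.82%"
```

Under that assumption the computed value matches the 81.82% given in the abstract, so the three reported figures are mutually consistent.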
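Cite this article: Wang Jun, Miao Jun, Qing Laiyun, Qiao Yuanhua. Text Detection Algorithm Based on Improved EAST [J]. Computer Science and Application, 2021, 11(1): 167-175. https://doi.org/10.12677/CSA.2021.111018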
