An Improved YOLOv5 Algorithm for Chinese Traffic Sign Detection
Chinese Traffic Sign Detection Based on YOLOv5 Algorithm
DOI: 10.12677/ORF.2022.124165. Supported by the National Natural Science Foundation of China.
Authors: Kaitao Luo, Qinglong He: School of Mathematics and Statistics, Guizhou University, Guiyang, Guizhou
Keywords: YOLOv5, Chinese Traffic Sign, Small Object Detection, Attention Mechanism
Abstract: In complex real-world road scenes, traffic signs occupy few pixels and appear densely packed, which leads to low localization accuracy and a high miss rate. To address this, we propose an improved YOLOv5 algorithm for Chinese traffic sign detection. First, the prior anchor boxes are optimized with a genetic algorithm and the K-means algorithm to improve detection accuracy on small objects. Second, a Bi-FPN structure is introduced into the neck of the model to fuse high-level semantic information with low-level positional information, which strengthens the representational capacity of the network. Third, the GAM attention mechanism is embedded in the backbone and neck to improve feature extraction and robustness to interference from cluttered backgrounds. Finally, the bounding-box regression loss is changed to the SIoU loss to achieve faster convergence and more accurate localization. Experimental results show that, compared with the standard YOLOv5 algorithm, the improved algorithm raises mean average precision (mAP), precision (P), recall, and F1 score by 10.03, 4.7, 2.6, and 3.48 percentage points, respectively, which verifies its effectiveness for Chinese traffic sign detection. The model runs at 208.33 FPS, which meets the requirement for real-time detection.
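The anchor-optimization step named in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which ground-truth box sizes (width, height) are clustered by K-means using the distance d = 1 - IoU. The function names, the toy box sizes, and the fixed seed are hypothetical and chosen only for illustration. The genetic-algorithm refinement is omitted, and this is not the authors' implementation.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU; return k anchors sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise anchors from the data itself
    for _ in range(iters):
        # Assignment step: each box joins the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: wh_iou(b, anchors[i]))
            clusters[best].append(b)
        # Update step: move each anchor to the mean width/height of its cluster.
        new_anchors = []
        for i, c in enumerate(clusters):
            if c:
                mean_w = sum(w for w, _ in c) / len(c)
                mean_h = sum(h for _, h in c) / len(c)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[i])  # keep the anchor of an empty cluster
        if new_anchors == anchors:
            break  # assignments are stable, so the anchors no longer move
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


def mean_best_iou(boxes, anchors):
    """Average, over all boxes, of the IoU with the best-matching anchor."""
    return sum(max(wh_iou(b, a) for a in anchors) for b in boxes) / len(boxes)


# Hypothetical sign sizes in pixels, not taken from any dataset.
toy_boxes = [(8, 8), (10, 9), (9, 11), (30, 28), (32, 33),
             (29, 31), (90, 80), (100, 95), (95, 90)]
anchors = kmeans_anchors(toy_boxes, k=3)
print(anchors, mean_best_iou(toy_boxes, anchors))
```

A higher mean best-match IoU indicates anchors that fit the box-size distribution more closely, which is the property that matters for small, densely packed signs. A genetic refinement would typically mutate these anchors and keep the mutations that raise such a fitness score.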
Cite this article: Luo, K. and He, Q. (2022) An Improved YOLOv5 Algorithm for Chinese Traffic Sign Detection. Operations Research and Fuzziology, 12(4), 1570-1584. https://doi.org/10.12677/ORF.2022.124165
