Improved YOLOv5s-Based Traffic Signal Detection Algorithm
Abstract: Traffic signal detection must cope with complex backgrounds, small targets, and strict real-time requirements. To address these challenges, this paper proposes YL-YOLOv5s, a detection network based on YOLOv5s. First, the residual structures in the Neck feature fusion network are replaced with multi-level cross-channel DenseNet connections to strengthen feature extraction and mitigate vanishing gradients. Next, the convolution operations in the DenseNet blocks are replaced with the inverted residual structure of MobileNetV2 to speed up inference and reduce memory usage. Finally, an Efficient Channel Attention (ECA) mechanism is embedded in the module to model correlations between channels, thereby improving detection accuracy. Experiments on the LaRa traffic light dataset (Paris, France) and the Bosch Small Traffic Lights Dataset (BSTLD) show that YL-YOLOv5s improves on YOLOv5s in key metrics such as mAP@0.5:0.95 and mAP@0.5. On LaRa, mAP@0.5:0.95 increases by 8.5% while the model's computational cost falls by 23.2%, indicating that YL-YOLOv5s improves small-target detection accuracy in complex environments and is markedly more lightweight.
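As an illustration of the channel attention step, the following is a minimal pure-Python sketch of an ECA-style operation: global average pooling per channel, a 1D convolution across neighbouring channels, a sigmoid, and channel-wise rescaling. It is not the authors' implementation. The function names are hypothetical, the kernel-size rule follows the adaptive formula from the ECA-Net paper with its default values gamma = 2 and b = 1, and the uniform convolution kernel is a placeholder for weights that would be learned in a real network.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size: truncate (log2(C) + b) / gamma, then
    bump even values up to the next odd number."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_weights(channel_means, kernel):
    """Attention weight per channel: zero-padded 1D convolution over the
    pooled channel descriptors, followed by a sigmoid."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for i in range(len(channel_means)):
        s = sum(kernel[j] * padded[i + j] for j in range(k))
        weights.append(1.0 / (1.0 + math.exp(-s)))
    return weights


def eca(feature_map, kernel=None):
    """Apply ECA-style attention.

    feature_map: list of C channels, each a list of H rows of W floats.
    kernel: 1D convolution weights (odd length); assumed uniform here,
            whereas a trained network would learn them.
    """
    # Global average pooling: one scalar descriptor per channel.
    means = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    if kernel is None:
        k = eca_kernel_size(len(feature_map))
        kernel = [1.0 / k] * k  # placeholder weights (assumption)
    weights = eca_weights(means, kernel)
    # Rescale every value in a channel by that channel's weight.
    return [
        [[v * w for v in row] for row in ch]
        for ch, w in zip(feature_map, weights)
    ]


if __name__ == "__main__":
    fmap = [[[0.0, 0.0]], [[2.0, 2.0]]]  # C=2, H=1, W=2
    print(eca_kernel_size(256))
    print(eca(fmap))
```

Because the only parameters are the few 1D kernel weights, this kind of attention adds very little computation, which is consistent with the lightweight goal stated in the abstract.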
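Cite this article: Huang Yinjie, Wang Fuyuan, Xiao Haining, Wang Zhonglou. Improved YOLOv5s-Based Traffic Signal Detection Algorithm [J]. Modeling and Simulation, 2023, 12(6): 5860-5874. https://doi.org/10.12677/MOS.2023.126532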
