Research on Improvement Based on YOLOX-s
Abstract: With the rapid development of deep learning, behavior recognition is increasingly applied in computer vision. As an important component of behavior recognition, object detection has become a prominent research topic. This paper targets YOLOX-s, a single-stage detector with strong performance, and proposes an attention-based improvement to it. In the network structure, an SE module and a CBAM module are added so that the network concentrates on the target regions of the image and extracts more detailed information. In addition, the DIoU loss function is adopted so that, for clustered targets, the predicted boxes come to overlap the target boxes more quickly, which improves the accuracy of the model. Experimental results show that, compared with the original YOLOX-s model, the proposed model achieves higher accuracy, recall, and mAP, bringing it a step closer to meeting practical requirements.
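To make the role of the DIoU loss concrete, here is a minimal sketch in plain Python. It assumes the commonly used definition, loss = 1 - IoU + d²/c², where d is the distance between the two box centers and c is the diagonal of the smallest box enclosing both. The function name `diou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` constant are illustrative assumptions, not taken from the paper, whose exact implementation is not shown here.

```python
def diou_loss(box_a, box_b, eps=1e-9):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: name, box format, and eps are assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    c2 = enc_w ** 2 + enc_h ** 2

    return 1.0 - iou + d2 / (c2 + eps)


if __name__ == "__main__":
    target = (0.0, 0.0, 1.0, 1.0)
    print(diou_loss(target, target))                 # identical boxes
    print(diou_loss(target, (2.0, 0.0, 3.0, 1.0)))   # disjoint, near
    print(diou_loss(target, (4.0, 0.0, 5.0, 1.0)))   # disjoint, farther
```

Unlike a plain IoU loss, which is constant for any pair of non-overlapping boxes, the center-distance term still changes as a disjoint prediction moves toward the target. This is the usual explanation for why DIoU pulls boxes into overlap more quickly.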
Citation: Tan, L.L., Qiu, Z.W., Dou, S.H. and Chen, G.R. (2022) Research on Improvement Based on YOLOX-s. Computer Science and Application, 12(8), 1904-1912. https://doi.org/10.12677/CSA.2022.128191
