YOLO-GW: A Lighter and More Precise YOLOv7 Model
DOI: 10.12677/csa.2024.1412239
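Authors: Zhu Fa: School of Computer Science and Technology, Tiangong University, Tianjin; Yuan Chunmiao: School of Software, Tiangong University, Tianjin; Yang Qingyong*: School of Software and Communication, Tianjin Sino-German University of Applied Sciences, Tianjin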
Keywords: Object Detection, YOLOv7, GSConv, WIoU, Feature Fusion
Abstract: YOLOv7 is an object detection algorithm, but it can strain compute and memory on resource-constrained devices. To address this, this paper modifies the YOLOv7 model with GSConv to reduce its complexity, and adds a P2 detection layer and the WIoU loss function to improve detection performance. To validate the method, we ran a series of experiments and comparisons, using the standard YOLOv7 model as the baseline and the public PASCAL VOC2007 dataset for training and testing. The results show that, compared with the standard model, the proposed model significantly reduces the parameter count while maintaining high detection accuracy, offering a viable way to deploy and run YOLOv7 on resource-limited devices.
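As a rough illustration of why a GSConv-style layer has fewer parameters than a standard convolution, the sketch below counts weights for both. It is not the authors' code. It assumes the layout described in the cited Slim-Neck work [2]: a dense convolution producing half the output channels, followed by a depthwise convolution on those channels, with the two results concatenated and shuffled. The 5 x 5 depthwise kernel, the omission of biases and normalization layers, and the function names `conv_params` and `gsconv_params` are all assumptions made for illustration.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a bias-free 2D convolution with a k x k kernel."""
    return (c_in // groups) * c_out * k * k


def gsconv_params(c_in, c_out, k, dw_k=5):
    """Weight count of a GSConv-style block (assumed layout, not the paper's code)."""
    half = c_out // 2
    # Dense convolution that produces only half of the output channels.
    dense = conv_params(c_in, half, k)
    # Depthwise convolution: one dw_k x dw_k filter per channel (groups == channels).
    depthwise = conv_params(half, half, dw_k, groups=half)
    # Concatenation and channel shuffle add no learnable weights.
    return dense + depthwise


# Hypothetical layer size chosen only for the example.
standard = conv_params(256, 256, 3)
gs = gsconv_params(256, 256, 3)
print(f"standard conv: {standard} weights")
print(f"GSConv-style:  {gs} weights ({gs / standard:.1%} of standard)")
```

Under these assumptions the GSConv-style block needs roughly half the weights of the standard layer, because the dense part produces only half the channels and the depthwise part is comparatively cheap. The actual savings in YOLO-GW depend on which layers are replaced and are reported in the paper itself.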
Cite this article: Zhu, F., Yuan, C. and Yang, Q. (2024) YOLO-GW: A Lighter and More Precise YOLOv7 Model. Computer Science and Application, 14(12), 44-53. https://doi.org/10.12677/csa.2024.1412239
