YOLO-Vortex: Underwater Target Detection Model Based on Vortex Aggregation Network
DOI: 10.12677/csa.2025.152033
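Authors: Luning Chang, Chunmiao Yuan: School of Software, Tiangong University, Tianjin; Qingyong Yang*: School of Software and Communication, Tianjin Sino-German University of Applied Sciences, Tianjin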
Keywords: Underwater Target Detection, YOLOv7, Vortex Aggregation Network, Noise Characteristic Disturbance
Abstract: Underwater object detection has important applications in ocean exploration, ecological protection, and underwater robot navigation. However, underwater environments are complex: lighting is uneven, suspended particles interfere with imaging, and images have low contrast. As a result, traditional object detection methods often perform poorly underwater, particularly when the data are noisy. To address this issue, this study proposes an improved model based on YOLOv7 for underwater object detection. We use YOLOv7 as the baseline model and optimize its key modules to overcome its limitations in underwater environments. Specifically, we propose a vortex aggregation network module to disrupt noisy data, and we place a spatial attention mechanism before it to help the network focus on important features and suppress irrelevant noise. To reduce the information loss that can occur during downsampling, we propose the Space-To-Depth Pooling (STD-MP) module, which converts spatial features into depth features and combines this with max pooling to perform downsampling. Finally, we optimize the loss function. Experimental results show that our model achieves a 4.2% improvement in mAP compared to the baseline model.
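The abstract describes STD-MP only at a high level, so the following is a minimal sketch of the two operations it names (space-to-depth rearrangement and max pooling), not the authors' implementation. The function names, the nested-list `[C][H][W]` layout, the block size of 2, and the channel-wise concatenation in `downsample` are all assumptions made for illustration.

```python
def space_to_depth(x, block=2):
    """Move each block x block spatial patch into the channel dimension.

    x is a feature map stored as nested lists with shape [C][H][W].
    Returns a map of shape [C * block * block][H // block][W // block],
    so resolution is halved (for block=2) without discarding any value.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for dy in range(block):
        for dx in range(block):
            for c in range(channels):
                # Take every block-th pixel, starting at offset (dy, dx).
                out.append([
                    [x[c][i][j] for j in range(dx, width, block)]
                    for i in range(dy, height, block)
                ])
    return out


def max_pool(x, size=2):
    """Non-overlapping size x size max pooling on a [C][H][W] map."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    return [
        [
            [
                max(x[c][i + di][j + dj]
                    for di in range(size) for dj in range(size))
                for j in range(0, width - size + 1, size)
            ]
            for i in range(0, height - size + 1, size)
        ]
        for c in range(channels)
    ]


def downsample(x):
    """Hypothetical combination: concatenate both branches along channels."""
    return space_to_depth(x) + max_pool(x)


# One channel, 4 x 4 spatial grid.
feature = [[[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]]
result = downsample(feature)
```

In this sketch the space-to-depth branch keeps all 16 input values (spread over four 2 x 2 channels), while the max-pooling branch keeps only the largest value in each patch. A real module would typically follow these branches with learned convolutions, which are omitted here.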
Citation: Chang, L., Yuan, C. and Yang, Q. (2025) YOLO-Vortex: Underwater Target Detection Model Based on Vortex Aggregation Network. Computer Science and Application, 15(2), 57-70. https://doi.org/10.12677/csa.2025.152033
