DMS-YOLO: An Underwater Blurred Object Detection Algorithm
DOI: 10.12677/csa.2026.164140
Authors: Ruifeng Huo, Guanghai Zheng*, School of Rail Intelligent Engineering, Dalian Jiaotong University, Dalian, Liaoning
Keywords: Underwater Object Detection, YOLOv11, DySample, DyHead, MSDA, Slide Loss
Abstract: To address the decline in detection accuracy caused by blurred underwater targets and low edge contrast, this paper proposes an improved underwater object detection model, DMS-YOLO. The method first introduces a dynamic upsampling module (DySample), which reconstructs features by adaptively adjusting sampling positions to improve the recovery of fine-grained detail. Next, a dynamic detection head (DyHead) enhances the adaptability of feature representation along three dimensions: scale, space, and task. Furthermore, a Multi-Scale Dilated Attention (MSDA) mechanism is embedded into the C2PSA structure to strengthen multi-scale feature fusion. Finally, the Slide Loss function is adopted, improving the model's ability to learn from hard classification samples through a dynamic weighting strategy. Experimental results on the RUOD dataset show that DMS-YOLO achieves an mAP@0.5 of 87.5% and an mAP@0.5:0.95 of 64.0%, indicating improved representation of blurred target details and multi-scale feature fusion while maintaining a reasonable trade-off between detection accuracy and computational complexity.
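The core idea the abstract attributes to DySample is replacing a fixed upsampling kernel with content-dependent sampling positions: each output pixel bilinearly samples the input at a base grid location plus a per-pixel offset. The following is a minimal single-channel NumPy sketch of that idea only; the function name is illustrative, offsets are supplied rather than learned, and the actual DySample module operates on multi-channel feature maps with point-wise learned offsets (typically via `grid_sample`).

```python
import numpy as np

def dynamic_sample_upsample(feat, offsets, scale=2):
    """Toy DySample-style 2x upsampling for a single-channel map.

    feat:    (H, W) input feature map
    offsets: (2, H*scale, W*scale) per-output-pixel (dy, dx) offsets,
             in input-pixel units (all zeros reduces to plain bilinear)
    """
    H, W = feat.shape
    Ho, Wo = H * scale, W * scale
    out = np.zeros((Ho, Wo))
    for i in range(Ho):
        for j in range(Wo):
            # base sampling position in input coordinates, plus offset
            y = (i + 0.5) / scale - 0.5 + offsets[0, i, j]
            x = (j + 0.5) / scale - 0.5 + offsets[1, i, j]
            # clamp to the valid input range
            y = min(max(y, 0.0), H - 1)
            x = min(max(x, 0.0), W - 1)
            # bilinear interpolation among the four neighbors
            y0, x0 = int(np.floor(y)), int(np.floor(x))
            y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
            dy, dx = y - y0, x - x0
            out[i, j] = (feat[y0, x0] * (1 - dy) * (1 - dx)
                         + feat[y0, x1] * (1 - dy) * dx
                         + feat[y1, x0] * dy * (1 - dx)
                         + feat[y1, x1] * dy * dx)
    return out

# With zero offsets this is ordinary bilinear upsampling; nonzero
# offsets let the model "look" where detail actually is.
demo = dynamic_sample_upsample(np.array([[0., 1.], [2., 3.]]),
                               np.zeros((2, 4, 4)))
```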
Citation: Huo, R. and Zheng, G. (2026) An Underwater Blurred Object Detection Algorithm Based on DMS-YOLO. Computer Science and Application, 16(4), 406-418. https://doi.org/10.12677/csa.2026.164140
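The "dynamic weighting strategy" of Slide Loss mentioned in the abstract can be sketched as follows. This follows the piecewise weighting commonly cited from YOLO-FaceV2 [15]: samples with IoU well below the mean μ keep weight 1, samples just below μ (the hard, boundary cases) receive the largest boost, and the weight decays exponentially for easy high-IoU samples. The function name and the standalone form are illustrative; in the actual loss this weight multiplies the per-sample classification loss.

```python
import math

def slide_weight(iou: float, mu: float) -> float:
    """Slide weighting factor for one sample.

    iou: the sample's IoU with its ground-truth box
    mu:  mean IoU over all samples, used as the easy/hard boundary
    """
    if iou <= mu - 0.1:
        # easy negatives: no re-weighting
        return 1.0
    if iou < mu:
        # near-boundary (hard) samples: constant emphasized weight
        return math.exp(1.0 - mu)
    # easy positives: weight decays as IoU grows
    return math.exp(1.0 - iou)
```

Because the boosted weight exp(1 − μ) exceeds 1 whenever μ < 1, gradient contributions concentrate on samples near the decision boundary, which is where blurred underwater targets tend to fall.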

References

[1] Xu, S., Zhang, M., Song, W., Mei, H., He, Q. and Liotta, A. (2023) A Systematic Review and Analysis of Deep Learning-Based Underwater Object Detection. Neurocomputing, 527, 204-232.
[2] Fan, Y., Zhang, L. and Li, P. (2024) A Lightweight Model of Underwater Object Detection Based on YOLOv8n for an Edge Computing Platform. Journal of Marine Science and Engineering, 12, Article 697.
[3] Wang, X., Gao, H., Jia, Z. and Li, Z. (2023) Bl-YOLOv8: An Improved Road Defect Detection Model Based on YOLOv8. Sensors, 23, Article 8361.
[4] Liu, Y., Huang, Z., Song, Q. and Bai, K. (2025) PV-YOLO: A Lightweight Pedestrian and Vehicle Detection Model Based on Improved YOLOv8. Digital Signal Processing, 156, Article 104857.
[5] Girshick, R. (2015) Fast R-CNN. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 7-13 December 2015, 1440-1448.
[6] Ren, S., He, K., Girshick, R. and Sun, J. (2017) Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149.
[7] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., et al. (2016) SSD: Single Shot MultiBox Detector. In: Leibe, B., Matas, J., Sebe, N. and Welling, M., Eds., Lecture Notes in Computer Science, Springer International Publishing, 21-37.
[8] Redmon, J., Divvala, S., Girshick, R. and Farhadi, A. (2016) You Only Look Once: Unified, Real-Time Object Detection. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 27-30 June 2016, 779-788.
[9] Liu, Y., An, D., Ren, Y., Zhao, J., Zhang, C., Cheng, J., et al. (2024) DP-FishNet: Dual-Path Pyramid Vision Transformer-Based Underwater Fish Detection Network. Expert Systems with Applications, 238, Article 122018.
[10] Liu, K., Sun, Q., Sun, D., Peng, L., Yang, M. and Wang, N. (2023) Underwater Target Detection Based on Improved YOLOv7. Journal of Marine Science and Engineering, 11, Article 677.
[11] Qu, S., Cui, C., Duan, J., Lu, Y. and Pang, Z. (2024) Underwater Small Target Detection under YOLOv8-LA Model. Scientific Reports, 14, Article No. 16108.
[12] Liu, W., Lu, H., Fu, H. and Cao, Z. (2023) Learning to Upsample by Learning to Sample. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, 1-6 October 2023, 6027-6037.
[13] Dai, X., Chen, Y., Xiao, B., Chen, D., Liu, M., Yuan, L., et al. (2021) Dynamic Head: Unifying Object Detection Heads with Attentions. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, 20-25 June 2021, 7373-7382.
[14] Jiao, J., Tang, Y., Lin, K., Gao, Y., Ma, A.J., Wang, Y., et al. (2023) DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition. IEEE Transactions on Multimedia, 25, 8906-8919.
[15] Yu, Z., Huang, H., Chen, W., Su, Y., Liu, Y. and Wang, X. (2024) YOLO-FaceV2: A Scale and Occlusion Aware Face Detector. Pattern Recognition, 155, Article 110714.
[16] Fu, C., Liu, R., Fan, X., Chen, P., Fu, H., Yuan, W., et al. (2023) Rethinking General Underwater Object Detection: Datasets, Challenges, and Solutions. Neurocomputing, 517, 243-256.
[17] Chen, H., Chen, K., Ding, G., Han, J., Lin, Z., Liu, L., et al. (2024) YOLOv10: Real-Time End-to-End Object Detection. Advances in Neural Information Processing Systems, 37, 107984-108011.
[18] Khanam, R. and Hussain, M. (2024) YOLOv11: An Overview of the Key Architectural Enhancements. arXiv:2410.17725.
[19] Tian, Y., Ye, Q. and Doermann, D. (2025) YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv:2502.12524.
[20] Zhao, Y., Lv, W., Xu, S., Wei, J., Wang, G., Dang, Q., et al. (2024) DETRs Beat YOLOs on Real-Time Object Detection. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 16-22 June 2024, 16965-16974.