Millimeter-Wave Image Target Segmentation Method Based on Improved YOLOv8

doi:10.12677/csa.2026.166229

A New Method for Millimeter-Wave Image Target Segmentation Using the Tuned YOLOv8

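Authors: Gao Chenkai (高晨凯), Qian Zifan (钱子凡), Li Wenjie (李文杰), Ye Xueyi (叶学义): School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang
Keywords: Millimeter-Wave Image Detection; Receptive Field Coordinate Attention Convolution; Adaptive Threshold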

Abstract: Current millimeter-wave image detection relies mostly on bounding boxes and does not make full use of target contour information, which limits the performance of detection systems. To address this problem, this paper proposes a segmentation model based on an improved YOLOv8 that uses contour information to raise detection accuracy; the approach integrates and optimizes existing advanced techniques for this specific application scenario. First, a Receptive Field Coordinate Attention Convolution module is introduced to strengthen the extraction of target contour features. Second, because millimeter-wave images are dominated by small targets, an additional detection layer dedicated to small targets is added. Finally, label assignment with an adaptive threshold is adopted to address the IoU sensitivity of small targets, further supporting accurate recognition of small-target contours. Experimental results show that the method improves box mAP50 by 2.6% and mask mAP50 by 10.4%, a clear accuracy advantage in millimeter-wave image detection.
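The IoU sensitivity of small targets and the idea of an adaptive threshold can be illustrated with a minimal Python sketch. It is not the paper's implementation: the threshold rule (mean plus standard deviation of the candidate IoUs) is borrowed from the ATSS scheme of Zhang et al. (2020) as an assumption, the function names `box_iou` and `adaptive_positives` are hypothetical, and the box coordinates are made-up values chosen only for illustration.

```python
from statistics import mean, pstdev


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def adaptive_positives(gt, candidates):
    """Select positive candidates for one ground-truth box.

    Assumption (ATSS-style): the threshold is the mean plus the
    population standard deviation of the candidate IoUs, so it
    adapts to each target instead of being a fixed value.
    """
    ious = [box_iou(gt, c) for c in candidates]
    thr = mean(ious) + pstdev(ious)
    keep = [i for i, v in enumerate(ious) if v >= thr]
    return thr, keep


if __name__ == "__main__":
    # The same 2-pixel offset on a small (8x8) and a large (80x80) box.
    small = box_iou((0, 0, 8, 8), (2, 2, 10, 10))
    large = box_iou((0, 0, 80, 80), (2, 2, 82, 82))
    print(f"small-box IoU: {small:.2f}, large-box IoU: {large:.2f}")

    # Three hypothetical candidates around the small ground-truth box.
    thr, keep = adaptive_positives(
        (0, 0, 8, 8),
        [(2, 2, 10, 10), (4, 4, 12, 12), (6, 6, 14, 14)],
    )
    print(f"adaptive threshold: {thr:.2f}, positives: {keep}")
```

With these illustrative boxes, the 2-pixel offset leaves the large box at an IoU of about 0.91 but drops the small box to about 0.39, so a fixed threshold of 0.5 would reject the small-target candidate. The adaptive threshold falls below 0.5 here and still keeps the best candidate as a positive. The full ATSS procedure also restricts candidates by center distance and requires the candidate center to lie inside the ground-truth box; those steps are omitted from this sketch.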

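Cite this article: Gao Chenkai, Qian Zifan, Li Wenjie, Ye Xueyi. A New Method for Millimeter-Wave Image Target Segmentation Using the Tuned YOLOv8 [J]. Computer Science and Application, 2026, 16(6): 300-312. https://doi.org/10.12677/csa.2026.166229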
