Research on an Air-Sea UAV Target Detection Model Based on Improved RT-DETR
Abstract: Air-sea UAV target detection is hampered by large variations in target scale and strong interference from dynamic backgrounds, while conventional models generalize poorly and process multi-scale features inefficiently. To address these problems, this paper proposes FD-ViT-RT-DETR, an improved RT-DETR model that integrates frequency-domain decoupling with a Vision Transformer. Built on the RT-DETR framework, the model adds a frequency-domain decoupling module at the front end, which uses the fast Fourier transform to separate domain-invariant image features (core target features) from domain-specific features (environmental interference) and applies an instance-level contrastive loss to sharpen feature discrimination. In the middle stage, the Vision Transformer architecture is optimized: a new spatial compression network reduces computational cost by 30% to suit edge devices, and a positional attention bias improves the capture of long-range dependencies. At the back end, an uncertainty-minimal query selection scheme is used to improve detection accuracy. In experiments on a dataset of 508 air-sea UAV images, the model achieves a precision (P) of 86.4%, a recall (R) of 87.4%, and an mAP50 of 88.1%, outperforming YOLOv5, YOLOv8, and the original RT-DETR on all three metrics. It is also more robust in complex scenes such as strong backlight and sea-surface glare from waves, offering an efficient detection solution for practical applications such as air-sea search and rescue and security monitoring.
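The abstract does not specify how the frequency-domain decoupling module partitions the spectrum, so the following is only a minimal sketch of the general idea, assuming a fixed circular low-pass mask on a single-channel image. The function name `frequency_decouple` and the parameter `radius_ratio` are hypothetical, and which band would correspond to domain-invariant versus domain-specific features is left open here rather than taken from the paper.

```python
import numpy as np


def frequency_decouple(image, radius_ratio=0.1):
    """Split a 2-D grayscale image into low- and high-frequency components.

    Assumption: a hard circular mask around the zero-frequency bin; the
    paper's actual module may use a different (possibly learned) split.
    """
    h, w = image.shape
    # Centered 2-D spectrum: after fftshift the DC term sits at (h//2, w//2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Boolean mask selecting frequencies within `radius` of the DC term.
    yy, xx = np.ogrid[:h, :w]
    cy, cx = h // 2, w // 2
    radius = radius_ratio * min(h, w)
    low_mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2

    # Inverse transform of each band; the two parts sum back to the input.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high
```

Because the two masks are complementary and the transform is linear, `low + high` reconstructs the input image up to floating-point error, so downstream branches could process each band separately without losing information.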
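Cite this article: 赵嘉懿, 肖馨悦, 董欣乐, 李自新. Research on an Air-Sea UAV Target Detection Model Based on Improved RT-DETR [J]. Computer Science and Application (计算机科学与应用), 2025, 15(11): 247-256. https://doi.org/10.12677/csa.2025.1511301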
