A Small Object Detection Model for UAV Aerial Images Based on an Improved YOLOv11s Approach
DOI: 10.12677/csa.2026.162081
Author: Yunlong Li (李云龙), School of Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi
Keywords: UAV, YOLOv11s, Small Object Detection, Detection Accuracy
Abstract: Unmanned aerial vehicles (UAVs) have significant engineering value in applications such as inspection patrols and agricultural monitoring. However, objects in UAV aerial images are small, densely distributed, and numerous, which makes detection difficult in practice. Improving small object detection performance on UAV aerial images is therefore an important and pressing technical problem. This paper proposes a small object detection model based on an improved YOLOv11s, named RB-YOLOv11s. First, a Reparameterized Ghost Cross-Stage Efficient Aggregation Network (RepGhostCSPELAN Net, abbreviated RGNet) is designed; it strengthens the model's representational capacity, integrates multi-level features, and reduces the model's parameter count and computational cost. In addition, the original Path Aggregation Network (PANet) is replaced with a BiFPN-GLSA network, which fuses the feature layers produced by the backbone and the neck and improves the model's perception of both global and local spatial information. Experiments on the VisDrone2019 UAV aerial image dataset show that, compared with the original YOLOv11s, RB-YOLOv11s incurs a slight increase in computation but reduces the parameter count by 25.5% and improves detection accuracy by 1.9%. The proposed model effectively addresses the low detection accuracy of small objects in UAV aerial images.
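The abstract does not state the fusion rule used inside the BiFPN-GLSA neck. As a rough illustration only, the sketch below shows the fast normalized fusion rule from the original BiFPN design, which this model may or may not follow. The function name, the flat-list representation of feature maps, and the `eps` value are assumptions made for this example and do not come from the paper.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-sized feature maps (flattened to lists) with learnable weights.

    Illustrative sketch of BiFPN-style fast normalized fusion:
        out = sum_i(w_i * f_i) / (eps + sum_i(w_i)),  with w_i = max(0, w_i)
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature map")
    if len({len(f) for f in features}) > 1:
        raise ValueError("all feature maps must have the same size")

    # ReLU keeps every weight non-negative so the normalization stays stable.
    clipped = [max(0.0, w) for w in weights]
    denom = eps + sum(clipped)

    # Weighted sum at each position, divided by the total weight.
    return [
        sum(w * v for w, v in zip(clipped, values)) / denom
        for values in zip(*features)
    ]


# Hypothetical inputs: two already-resized feature maps of length 2.
fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the inputs; a weight that becomes negative during training is clipped to zero, so that input simply stops contributing.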
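Cite this article: Li, Y. L. A Small Object Detection Model for UAV Aerial Images Based on an Improved YOLOv11s Approach [J]. Computer Science and Application, 2026, 16(2): 522-531. https://doi.org/10.12677/csa.2026.162081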
