Remote Sensing Target Detection Based on Information Expansion Collection and Adaptive Feature Fusion
Abstract: Building on existing methods, this paper proposes a detection algorithm based on Information Expansion Collection and Adaptive Feature Fusion. ConvNeXt modules are introduced into the backbone network to strengthen the detection of occluded targets, and an Information Expansion Collection module is proposed to make fuller use of the contextual information in the image, improving detection of targets with large aspect ratios. A Coordinate Attention module is used to prevent the loss of target position information, and a Squeeze-and-Excitation module reassigns weights to the channels, selecting the more important channels for computation. An Adaptive Spatial Feature Fusion module fuses feature maps from different levels to preserve the effectiveness of the feature pyramid. On the DOTA-v1.5 remote sensing dataset, the proposed method improves mAP@0.5 by 2.5 percentage points over the original network. On two other datasets, the proposed algorithm also achieves better detection performance than current state-of-the-art detection algorithms.
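The channel reweighting step mentioned in the abstract can be illustrated with a minimal NumPy sketch of a standard squeeze-and-excitation-style block. This is not the authors' implementation: the function name `squeeze_excite`, the tensor sizes, the reduction ratio `r`, and the random weights are all assumptions chosen for illustration, and biases are omitted.

```python
import numpy as np


def squeeze_excite(x, w1, w2):
    """Reweight the channels of a feature map x with shape (C, H, W).

    w1 has shape (C // r, C) and w2 has shape (C, C // r); together they
    play the role of the two fully connected layers of the excitation step.
    """
    # Squeeze: global average pooling over the spatial axes -> (C,)
    squeezed = x.mean(axis=(1, 2))
    # Excitation: reduce the channel dimension, then apply ReLU -> (C // r,)
    hidden = np.maximum(w1 @ squeezed, 0.0)
    # Restore the channel dimension and apply a sigmoid -> (C,), each in (0, 1)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Scale: multiply every channel by its learned importance weight
    return x * weights[:, None, None], weights


# Hypothetical sizes and random weights, used only to exercise the function
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y, weights = squeeze_excite(x, w1, w2)
```

In this sketch every channel is rescaled rather than discarded, so "selecting" important channels amounts to giving them weights closer to 1 while less important channels are attenuated.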
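Citation: Wang Wencong, Zhang Sunjie. Remote Sensing Target Detection Based on Information Expansion Collection and Adaptive Feature Fusion[J]. Software Engineering and Applications, 2025, 14(2): 484-498. https://doi.org/10.12677/sea.2025.142043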
