Small Object Detection and Application Based on Improved SSD Algorithm
DOI: 10.12677/CSA.2021.114109
Authors: Yang Liu, Yinwei Zhan: School of Computers, Guangdong University of Technology, Guangzhou, Guangdong
Keywords: SSD, Deep Learning, Small Object Detection
Abstract: Generic object detection methods perform poorly on small objects in complex environments and suffer from a high miss rate. To address this, the paper improves the SSD algorithm for small object detection. First, feedback from the training loss is used as a decision condition and combined with data augmentation, which compensates for inadequate supervision of small objects during training, makes the model more robust to interference in complex environments, and reduces the miss rate for small objects. Second, an attention mechanism is introduced into the network by adding an SENet (Squeeze-and-Excitation) module, which reassigns the weights of the model's feature channels, suppressing uninformative feature weights and increasing the share of useful ones. Experimental results show that, compared with the original SSD algorithm and without adding much computation, the improved SSD algorithm achieves clearly higher accuracy on both the VOC dataset and a construction-site safety helmet wearing dataset.
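As a rough illustration of the channel reweighting mentioned in the abstract, the following dependency-free Python sketch follows the general squeeze-and-excitation recipe: global average pooling per channel, two fully connected layers, and sigmoid gating. The function name `se_reweight`, the weight matrices `w1` and `w2`, and the omission of bias terms are assumptions made for illustration. It is not the authors' implementation, and it does not show where in the SSD network the module is placed.

```python
import math


def se_reweight(feature_map, w1, w2):
    """Apply squeeze-and-excitation style channel reweighting.

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: reduction weights, shape (C // r) x C (hypothetical, no bias).
    w2: expansion weights, shape C x (C // r) (hypothetical, no bias).
    Returns (reweighted feature map, per-channel gates).
    """
    # Squeeze: global average pooling turns each channel into one scalar.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: fully connected reduction followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: fully connected expansion followed by sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: two 2x2 channels and a reduction to a single hidden unit.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w1 = [[1.0, 1.0]]      # 1 x 2
w2 = [[0.0], [1.0]]    # 2 x 1
scaled, gates = se_reweight(features, w1, w2)
print(gates)
```

In this toy setup the second channel receives the larger gate, so its values are attenuated less than those of the first channel. In a trained network the weights are learned, and the gates reflect which channels the model has found informative.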
Citation: Liu, Y. and Zhan, Y.W. (2021) Small Object Detection and Application Based on Improved SSD Algorithm. Computer Science and Application, 11(4), 1061-1069. https://doi.org/10.12677/CSA.2021.114109
