Target Detection and Recognition Based on Improved Faster R-CNN
DOI: 10.12677/JISP.2019.82007
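Authors: Jingjing Fang (房靖晶), Jinyong Cheng (成金勇): School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong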
Keywords: Deep Learning, Object Detection, Region Proposal Network, Feature Extraction
Abstract: In recent years, as deep learning has continued to develop, image research and applications based on deep learning have achieved excellent results in many fields. Frameworks such as the R-CNN network and the fully convolutional network have accelerated the development of object detection. The Faster R-CNN algorithm was proposed and is now widely used for object detection and recognition. This paper studies object detection with Faster R-CNN on images from a self-made data set of office supplies. Compared with earlier algorithms in the R-CNN series, Faster R-CNN introduces the region proposal network and integrates feature extraction, candidate box extraction, bounding box regression, and classification into a single network, which greatly improves overall performance. This paper proposes an improved Faster R-CNN algorithm based on AlexNet, in which the improvement concerns the activation function: during feature extraction the data set usually contains a large number of dense, continuous features, whereas the activation function is sparse. The improved algorithm addresses the detection of office supplies when targets are small and backgrounds are complex, and improves both detection speed and detection accuracy.
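Cite this article: Jingjing Fang, Jinyong Cheng. Target Detection and Recognition Based on Improved Faster R-CNN [J]. Journal of Image and Signal Processing, 2019, 8(2): 43-50. https://doi.org/10.12677/JISP.2019.82007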
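
The abstract contrasts dense, continuous features with a sparse activation function. The sketch below only illustrates that contrast and is not the authors' implementation. It assumes the activation is a ReLU, which the abstract does not state, and it uses hypothetical zero-mean Gaussian values in place of real convolutional features.

```python
import random


def relu(x):
    """Rectified linear unit: negative inputs become exactly zero."""
    return x if x > 0.0 else 0.0


def zero_fraction(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


random.seed(0)

# Hypothetical dense, continuous pre-activation features (assumption:
# zero-mean Gaussian values stand in for real convolutional outputs).
features = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Applying the activation zeroes out every negative feature.
activations = [relu(v) for v in features]

dense_sparsity = zero_fraction(features)
relu_sparsity = zero_fraction(activations)

print(f"zeros before activation: {dense_sparsity:.2%}")
print(f"zeros after ReLU:        {relu_sparsity:.2%}")
```

With zero-mean inputs, roughly half of the activations end up exactly zero, which is the sense in which a ReLU-style activation is sparse. Real feature maps are not zero-mean Gaussian, so the actual ratio depends on the network and the data.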
