A Survey of Mask RCNN Research with Improved Feature Pyramid Networks
DOI: 10.12677/CSA.2022.1210238
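Authors: Li Gaijun: School of Science, Tianjin University of Commerce, Tianjin; Han Jianfeng: School of Information Engineering, Tianjin University of Commerce, Tianjin
Keywords: Target Detection, Mask RCNN, Feature Pyramid, Feature Fusion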
Abstract: With the development of computer vision, improving the accuracy of object detection has become an important research topic. Object detection methods are divided into one-stage and two-stage approaches: YOLO and SSD are one-stage detectors, while the R-CNN family (Fast RCNN, Faster RCNN, Mask RCNN) are two-stage detectors. Detection accuracy depends on the quality of feature extraction, and the feature pyramid is a basic component of recognition systems that detect objects at different scales. Mask RCNN is a two-stage object detection algorithm based on a region-proposal convolutional neural network with segmentation masks, and it achieves relatively high accuracy. Starting from its feature pyramid network, this paper surveys the improved feature pyramid network algorithms proposed in recent years. The survey finds that adding new bottom-up or top-down feature fusion paths with lateral connections to the original feature pyramid network makes fuller use of low-level information, and that fusion methods such as bidirectional and hierarchical skip connections can improve the accuracy of small-object detection. These improved algorithms effectively raise object detection accuracy.
Citation: Li Gaijun, Han Jianfeng. A Survey of Mask RCNN Research with Improved Feature Pyramid Networks [J]. Computer Science and Application, 2022, 12(10): 2331-2337. https://doi.org/10.12677/CSA.2022.1210238
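
The abstract describes two kinds of fusion path: the top-down path with lateral connections of the original feature pyramid network, and an additional bottom-up path that carries low-level information back to the coarser levels. The following is a minimal sketch of that idea, not the method of any surveyed paper. It assumes single-channel feature maps stored as nested lists, identity lateral connections, nearest-neighbour upsampling, and 2x2 max pooling; the learned 1x1 and 3x3 convolutions of a real network are omitted, and all function names (`upsample2x`, `downsample2x`, `top_down`, `bottom_up`, `make_map`) are hypothetical.

```python
# Illustrative sketch only: single-channel feature maps as nested lists.
# Real feature pyramid networks use learned 1x1 lateral and 3x3 smoothing
# convolutions; both are replaced by the identity here (an assumption).

def upsample2x(fmap):
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def downsample2x(fmap):
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]), 2)]
        for i in range(0, len(fmap), 2)
    ]

def add(a, b):
    """Element-wise sum of two maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def top_down(levels):
    """Fuse backbone maps ordered fine -> coarse, each half the previous size.

    Starts at the coarsest map and repeatedly adds its upsampled version
    to the next finer map (the lateral connection).
    """
    fused = [levels[-1]]
    for lateral in reversed(levels[:-1]):
        fused.insert(0, add(lateral, upsample2x(fused[0])))
    return fused

def bottom_up(pyramid):
    """Extra bottom-up path: push fine-level information to coarser levels."""
    out = [pyramid[0]]
    for level in pyramid[1:]:
        out.append(add(level, downsample2x(out[-1])))
    return out

def make_map(size, value):
    """Constant square map, used only to make the data flow easy to follow."""
    return [[value] * size for _ in range(size)]

# Three toy backbone levels: 4x4, 2x2, and 1x1.
c2, c3, c4 = make_map(4, 1.0), make_map(2, 10.0), make_map(1, 100.0)
p = top_down([c2, c3, c4])   # top-down fusion with lateral connections
n = bottom_up(p)             # additional bottom-up fusion path
```

With constant inputs the contribution of each level stays visible in the fused values: after the top-down pass the finest map holds the sum of all three levels, and after the bottom-up pass the coarsest map has also received the fine-level signal.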
