Deep Convolutional Neural Network Model Based on Multi-Region Feature
Abstract: Existing deep convolutional neural network models have complex structures and a high computational cost, which limits their use in practice. To address this, this paper proposes a deep convolutional neural network model based on multi-region features. The model first divides the image into multiple regions and applies standard convolution operations to obtain the semantic context information of the image. It then uses the multi-region input to learn context interaction features: the spatial information of the global region and of the sub-regions is concatenated and fed into a convolutional layer, so that the contextual features of the image are extracted in a mutually complementary way. Finally, the image is classified with the Softmax function. The experimental results show that the model has a simple structure and few parameters, and that modeling context with fused multi-region features is more robust and yields higher classification accuracy than modeling with single-region features.
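The multi-region input described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 2 x 2 grid, the use of average pooling to shrink the global region to the sub-region size, and the function names `split_regions`, `downsample`, and `multi_region_input` are all assumptions made for illustration. The abstract does not specify these details, and the convolutional layers themselves are omitted.

```python
import numpy as np

def split_regions(image, rows=2, cols=2):
    """Split an (H, W, C) image into rows*cols equally sized sub-regions."""
    h, w, _ = image.shape
    rh, rw = h // rows, w // cols
    return [image[r * rh:(r + 1) * rh, c * rw:(c + 1) * rw]
            for r in range(rows) for c in range(cols)]

def downsample(image, rows=2, cols=2):
    """Average-pool the whole image down to the size of one sub-region.

    Assumption: the source does not say how the global region is resized.
    """
    h, w, ch = image.shape
    rh, rw = h // rows, w // cols
    cropped = image[:rh * rows, :rw * cols]
    return cropped.reshape(rh, rows, rw, cols, ch).mean(axis=(1, 3))

def multi_region_input(image, rows=2, cols=2):
    """Concatenate the global view and all sub-regions along the channel axis."""
    parts = [downsample(image, rows, cols)] + split_regions(image, rows, cols)
    return np.concatenate(parts, axis=-1)

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# A random 32 x 32 RGB image stands in for a real input.
image = np.random.default_rng(0).random((32, 32, 3))
stacked = multi_region_input(image)
print(stacked.shape)  # (16, 16, 15): 3 global channels + 4 regions x 3 channels
```

In this sketch the stacked tensor would be the input to a convolutional layer, and `softmax` would be applied to the class scores produced by the final layer.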
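Cite this article: Wang, Y.M. and Wang, Z.Y. (2019) Deep Convolutional Neural Network Model Based on Multi-Region Feature. Advances in Applied Mathematics, 8(11), 1753-1765. https://doi.org/10.12677/AAM.2019.811205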
