The Influence of the Amount of Parameters in Different Layers on the Performance of Deep Learning Models
DOI: 10.12677/CSA.2015.512056. Supported by the National Natural Science Foundation of China.
Authors: Xibin Yue, Liang Tang*: School of Technology, Beijing Forestry University, Beijing; Xiaolin Hu: Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing
Keywords: Convolutional Neural Network, Recurrent Convolutional Neural Network, Deep Learning
Abstract: In recent years, deep learning has been widely used in many pattern recognition tasks, including image classification and speech recognition, owing to its excellent performance, but a general rule for designing the network structure is still lacking. Using two deep learning models, the convolutional neural network (CNN) and the recurrent convolutional neural network (RCNN), we explored how the distribution of parameters across layers affects network performance, and carried out extensive experiments on the CIFAR-10, CIFAR-100 and SVHN datasets. The results show that when the total number of parameters is kept roughly constant, near the critical value at which performance saturates, increasing the number of parameters in higher layers boosts performance, while increasing the number of parameters in lower layers degrades it. Guided by this simple rule, our RCNN architectures achieved the lowest single-model recognition error rates reported to date on CIFAR-100 and SVHN.
Citation: Yue, X.B., Hu, X.L. and Tang, L. (2015) The Influence of the Amount of Parameters in Different Layers on the Performance of Deep Learning Models. Computer Science and Application, 5, 445-453. http://dx.doi.org/10.12677/CSA.2015.512056
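To make the layer-wise parameter-allocation rule concrete, the sketch below counts the parameters of two hypothetical four-layer convolutional stacks with roughly equal totals: a uniform-width baseline, and a variant that shifts capacity from the lower layers to the higher ones. The 3x3 kernel size and the channel widths are illustrative assumptions, not the configurations evaluated in the paper.

    # Parameter counting for a stack of 3x3 convolutional layers.
    # Channel widths are hypothetical examples, not the paper's settings.

    def conv_params(in_ch, out_ch, k=3):
        # k*k*in_ch*out_ch weights, plus one bias per output channel
        return k * k * in_ch * out_ch + out_ch

    def stack_params(widths, in_ch=3):
        # Total parameters of a conv stack, given each layer's output width
        total = 0
        for out_ch in widths:
            total += conv_params(in_ch, out_ch)
            in_ch = out_ch
        return total

    uniform = [96, 96, 96, 96]         # parameters spread evenly across layers
    bottom_light = [64, 64, 128, 128]  # fewer low-layer, more high-layer parameters

    print(stack_params(uniform))       # 251808
    print(stack_params(bottom_light))  # 260160

Both stacks hold about 0.25 M parameters, within roughly 3% of each other, yet allocate them very differently across layers; under the paper's finding, the bottom-light variant would be the better candidate at this fixed parameter budget.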
