Regularization Methods in Deep Learning (深度学习中的正则化方法研究)
DOI: 10.12677/CSA.2020.106126
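Authors: Guoning Wu (武国宁), Huifeng Hu (胡汇丰), Mengmeng Yu (于萌萌): Department of Mathematics, College of Science, China University of Petroleum (Beijing), Beijing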
Keywords: Deep Neural Network (DNN), Overfitting, L1 Regularization, L2 Regularization, Dropout, MNIST
Abstract: Neural networks with millions of parameters overfit easily, even when trained on large datasets. A range of regularization methods has been proposed to constrain the parameters during training. This paper reviews the L1, L2, and Dropout regularization methods used in deep learning. Finally, comparative numerical experiments on MNIST handwritten digit recognition are conducted using these regularization methods.
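The three methods named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the coefficient `lam`, the 1/2 factor in the L2 term, and the use of "inverted" dropout (rescaling kept units by `1 / keep_prob` during training) are all assumptions chosen for illustration.

```python
import random


def l1_penalty(weights, lam):
    """L1 term added to the loss: lam * sum(|w|). Encourages sparse weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 term added to the loss: (lam / 2) * sum(w^2). Shrinks large weights."""
    return 0.5 * lam * sum(w * w for w in weights)


def dropout(activations, keep_prob, rng):
    """Inverted dropout (an assumed variant): keep each unit with probability
    keep_prob and rescale kept units by 1 / keep_prob; dropped units become 0."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# Hypothetical example values, for illustration only.
weights = [0.5, -1.0, 2.0]
print(l1_penalty(weights, lam=0.1))
print(l2_penalty(weights, lam=0.1))
print(dropout([1.0, 2.0, 3.0, 4.0], keep_prob=0.5, rng=random.Random(0)))
```

In a training loop, one of the penalty terms would typically be added to the data loss before computing gradients, and dropout would be applied to hidden activations during training only.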
Citation: Wu, G., Hu, H. and Yu, M. Regularization Methods in Deep Learning [J]. Computer Science and Application, 2020, 10(6): 1224-1233. https://doi.org/10.12677/CSA.2020.106126
