Barzilai-Borwein型算法的探讨
Discussion on Barzilai-Borwein Algorithm
摘要: 本文介绍了八种负梯度算法,根据特点对其进行了比较,并对不同维数的严格凸二次函数进行了计算,绘制图表观察数据,发现BB型算法更具优势,维数越大优势越大。选取不同的初始步长可以改变算法的效果,特别是当矩阵条件数越大时,初始步长的选取越关键,本文考虑了四种选择初始步长的方法,分别将其放入算法中进行数值实验,结果表明,选取Hessian矩阵最小特征值的倒数效果最好。最后,介绍了负梯度算法在深度学习中的应用。
Abstract: This paper introduces eight kinds of negative gradient algorithms, compares them according to their characteristics, calculates strictly convex quadratic functions of different dimensions, draws graphs and observes data, and finds that BB algorithm is more advantageous, the larger the dimension is, the greater the advantage is. Selecting different initial step size can change the effect of the algorithm, especially when the number of matrix conditions is larger, the selection of the initial step size is more critical. This paper considers four methods of selecting the initial step size, and puts them into the algorithm for numerical experiments. The results show that the reciprocal of the minimum eigenvalue of the Hessian matrix is the best. Finally, the application of negative gradient algorithm in deep learning is introduced.
文章引用:黄亚楠. Barzilai-Borwein型算法的探讨[J]. 应用数学进展, 2022, 11(4): 2242-2258. https://doi.org/10.12677/AAM.2022.114238

参考文献

[1] Barzilai, J. and Borwein, J.M. (1988) Two-Point Step Size Gradient Methods. IMA Journal of Numerical Analysis, 8, 141-148. [Google Scholar] [CrossRef
[2] Cauchy, A. (1847) Méthode générale pour la résolution des systemes déquations simultanées. Comptes rendus de l’Académie des Sciences Paris, 25, 536-538.
[3] Raydan, M. (1993) On the Barzilai and Borwein Choice of Steplength for the Gradient Method. IMA Journal of Numerical Analysis, 13, 321-326. [Google Scholar] [CrossRef
[4] Dai, Y.H. and Liao, L.Z. (2002) R-Linear Convergence of the Barzilai and Borwein Gradient Method. IMA Journal of Numerical Analysis, 22, 1-10. [Google Scholar] [CrossRef
[5] Roger, F. (2005) On the Barzilakborwein. Optimization and Control with Applications, 96, 235-256.
[6] Dai, Y.H. (2003) Alternate Step Gradient Method. Optimization, 52, 395-415. [Google Scholar] [CrossRef
[7] Zhou, B., Gao, L. and Dai, Y.H. (2006) Gradient Methods with Adaptive Step-Sizes. Computational Optimization and Applications, 35, 69-86. [Google Scholar] [CrossRef
[8] Wan, Z., Guo, J., Liu, J. and Liu, W. (2018) A Modified Spectral Conjugate Gradient Projection Method for Signal Recovery. Signal, Image and Video Processing, 12, 1455-1462. [Google Scholar] [CrossRef
[9] Sopyła, K. and Drozda, P. (2015) Stochastic Gradient Descent with Barzilai-Borwein Update Step for SVM. Information Sciences, 316, 218-233. [Google Scholar] [CrossRef
[10] Chen, X., Liu, Y. and Wan, Z. (2016) Optimal Decision Making for Online and Offline Retailers under BOPS Mode. The ANZIAM Journal, 58, 187-208. [Google Scholar] [CrossRef
[11] Li, Y., Wan, Z. and Liu, J. (2017) Bi-Level Programming Approach to Optimal Strategy for Vendor-Managed Inventory Problems under Random Demand. The ANZIAM Journal, 59, 247-270. [Google Scholar] [CrossRef
[12] Zhang, X., Huang, S. and Wan, Z. (2016) Optimal Pricing and Ordering in Global Supply Chain Management with Constraints under Random Demand. Applied Mathematical Modelling, 40, 10105-10130. [Google Scholar] [CrossRef
[13] Li, T. and Wan, Z. (2019) New Adaptive Barzilai-Borwein Step Size and Its Application in Solving Large-Scale Optimization Problems. The ANZIAM Journal, 61, 76-98. [Google Scholar] [CrossRef
[14] 李化欣, 潘晋孝. 最速下降法在图像重建中的应用[J]. 科技情报开发与经济, 2006, 16(3): 155-156.
[15] Farsiu, S., Robinson, M.D., Elad, M. and Milanfar, P. (2004) Fast and Robust Multiframe Super Resolution. IEEE Transactions on Image Processing, 13, 1327-1344. [Google Scholar] [CrossRef