A Stochastic Alternating Direction Method of Multipliers with Adaptive Step-Size Selection
DOI: 10.12677/AAM.2023.129401. Supported by the National Natural Science Foundation of China.
Authors: Jing Li, Dan Xue*, School of Mathematics and Statistics, Qingdao University, Qingdao, Shandong
Keywords: Stochastic Optimization, Alternating Direction Method of Multipliers, Machine Learning
Abstract: This paper studies convex stochastic optimization problems with separable variables and proposes a new stochastic alternating direction method of multipliers (ADMM) algorithm. The algorithm combines ADMM with the stochastic variance-reduced gradient method with Barzilai-Borwein step sizes (SVRG-BB): the BB step size allows the resulting SVRG-ADMM method to select its step size adaptively, with no need for a decreasing step-size schedule or manual tuning. Under standard assumptions, the convergence of the algorithm is proved. Finally, numerical experiments are reported that demonstrate the effectiveness of the algorithm.
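The abstract does not reproduce the paper's SVRG-ADMM iteration, but the Barzilai-Borwein step-size rule it adopts comes from SVRG-BB (Tan et al., 2016) and can be sketched compactly: eta_k = ||s_k||^2 / (m * s_k^T y_k), where s_k and y_k are the differences between consecutive snapshot points and between their full gradients, and m is the inner-loop length. The Python sketch below shows plain SVRG with this BB rule; the function svrg_bb, the callbacks grad_i and full_grad, and all defaults are illustrative assumptions rather than the authors' implementation, and the paper's ADMM splitting step is omitted.

    import numpy as np

    def svrg_bb(grad_i, full_grad, x0, n, m=None, eta0=0.01, outer_iters=20, seed=0):
        # Minimal SVRG-BB sketch (Tan et al., 2016); names and defaults are
        # illustrative, not taken from the paper under discussion.
        rng = np.random.default_rng(seed)
        m = 2 * n if m is None else m              # common choice of inner-loop length
        x_tilde = x0.copy()
        x_prev = g_prev = None
        eta = eta0                                 # used only in the first outer loop
        for _ in range(outer_iters):
            g = full_grad(x_tilde)                 # full gradient at the snapshot
            if g_prev is not None:
                s, y = x_tilde - x_prev, g - g_prev
                eta = (s @ s) / (m * (s @ y))      # BB step size, scaled by m
            x_prev, g_prev = x_tilde, g
            x = x_tilde.copy()
            for _ in range(m):                     # variance-reduced inner loop
                i = rng.integers(n)
                x = x - eta * (grad_i(x, i) - grad_i(x_tilde, i) + g)
            x_tilde = x                            # new snapshot
        return x_tilde

    # Usage on a synthetic least-squares instance (for illustration only):
    rng = np.random.default_rng(1)
    A, b = rng.standard_normal((200, 10)), rng.standard_normal(200)
    grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]      # gradient of the i-th term
    full_grad = lambda x: A.T @ (A @ x - b) / len(b)    # average over all terms
    x_hat = svrg_bb(grad_i, full_grad, np.zeros(10), n=len(b))

After the first outer iteration, eta0 is never used again: the BB rule recomputes the step size from the two most recent snapshots, which is what removes the need for a decreasing schedule or hand tuning.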
Article citation: Li, J. and Xue, D. (2023) A Stochastic Alternating Direction Method of Multipliers with Adaptive Step-Size Selection. Advances in Applied Mathematics, 12(9), 4090-4104. https://doi.org/10.12677/AAM.2023.129401
