A Stochastic Alternating Direction Method of Multipliers with Adaptive Step-Size Selection
DOI: 10.12677/AAM.2023.129401. Supported by the National Natural Science Foundation of China.
Authors: Jing Li, Dan Xue*, School of Mathematics and Statistics, Qingdao University, Qingdao, Shandong
Keywords: Stochastic Optimization, Alternating Direction Method of Multipliers, Machine Learning
Abstract: This paper studies convex stochastic optimization problems with separable variables and proposes a new stochastic alternating direction method of multipliers (ADMM) algorithm. The algorithm combines ADMM with the stochastic variance-reduced gradient method with Barzilai-Borwein step sizes (SVRG-BB): the BB step size allows the resulting SVRG-ADMM method to select its step size adaptively, with no need for a decreasing step-size schedule or manual tuning. Under standard assumptions, the convergence of the algorithm is proved. Finally, numerical experiments are reported that demonstrate the effectiveness of the algorithm.
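The abstract does not reproduce the paper's SVRG-ADMM iteration, but the Barzilai-Borwein step-size rule it adopts comes from SVRG-BB (Tan et al., 2016) and can be sketched compactly: eta_k = ||s_k||^2 / (m * s_k^T y_k), where s_k and y_k are the differences between consecutive snapshot points and between their full gradients, and m is the inner-loop length. The Python sketch below shows plain SVRG with this BB rule; the function svrg_bb, the callbacks grad_i and full_grad, and all defaults are illustrative assumptions rather than the authors' implementation, and the paper's ADMM splitting step is omitted.

    import numpy as np

    def svrg_bb(grad_i, full_grad, x0, n, m=None, eta0=0.01, outer_iters=20, seed=0):
        # Minimal SVRG-BB sketch (Tan et al., 2016); names and defaults are
        # illustrative, not taken from the paper under discussion.
        rng = np.random.default_rng(seed)
        m = 2 * n if m is None else m              # common choice of inner-loop length
        x_tilde = x0.copy()
        x_prev = g_prev = None
        eta = eta0                                 # used only in the first outer loop
        for _ in range(outer_iters):
            g = full_grad(x_tilde)                 # full gradient at the snapshot
            if g_prev is not None:
                s, y = x_tilde - x_prev, g - g_prev
                eta = (s @ s) / (m * (s @ y))      # BB step size, scaled by m
            x_prev, g_prev = x_tilde, g
            x = x_tilde.copy()
            for _ in range(m):                     # variance-reduced inner loop
                i = rng.integers(n)
                x = x - eta * (grad_i(x, i) - grad_i(x_tilde, i) + g)
            x_tilde = x                            # new snapshot
        return x_tilde

    # Usage on a synthetic least-squares instance (for illustration only):
    rng = np.random.default_rng(1)
    A, b = rng.standard_normal((200, 10)), rng.standard_normal(200)
    grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]      # gradient of the i-th term
    full_grad = lambda x: A.T @ (A @ x - b) / len(b)    # average over all terms
    x_hat = svrg_bb(grad_i, full_grad, np.zeros(10), n=len(b))

After the first outer iteration, eta0 is never used again: the BB rule recomputes the step size from the two most recent snapshots, which is what removes the need for a decreasing schedule or hand tuning.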
Article citation: Li, J. and Xue, D. (2023) A Stochastic Alternating Direction Method of Multipliers with Adaptive Step-Size Selection. Advances in Applied Mathematics, 12(9), 4090-4104. https://doi.org/10.12677/AAM.2023.129401
