一类带偶次惩罚范数的非凸函数及周期ADMM算法的收敛性分析

doi:10.12677/aam.2024.136252

期刊菜单

一类带偶次惩罚范数的非凸函数及周期ADMM算法的收敛性分析
Convergence Analysis of a Class of Nonconvex Functions with Even-Powered Penalty Norms and the Periodic ADMM Algorithm

DOI: 10.12677/aam.2024.136252, PDF,
作者: 宋政纲：兰州理工大学理学院，甘肃兰州
关键词: 机器学习；非凸函数；Lp范数；交替方向乘子；Machine Learning； Nonconvex Function； Lp Norm； Alternating Direction Multiplier

摘要: 在机器学习以及其它相关领域中，针对非凸函数的优化问题，目前存在的算法理论上对非凸函数的收敛和全局稳定性无法得到有效保证。本文提出将Lp范数(p为偶数)引入到非凸函数中，并在此基础上设计一种周期交替方向乘子(Periodic Alternating Direction Method of Multipliers, PADMM)的优化算法，用于此类非凸函数收敛性分析。我们证明在惩罚参数足够大的情况下，带偶次惩罚范数的非凸函数必收敛，并且收敛到全局最小值。此外，PADMM算法不对变量更新的先后顺序作特殊要求，这一特性大大增强了PADMM算法在处理各类非凸函数优化问题时的普适性。

Abstract: In machine learning and other related fields, for the optimization problem of non-convex functions, the existing algorithms cannot effectively guarantee the convergence and global stability of non-convex functions in theory. In this paper, the Lp norm (p is even) is introduced into the non-convex function, and on this basis, an optimization algorithm of Periodic Alternating Direction Method of Multipliers (PADMM) is designed for the convergence analysis of such non-convex functions. We prove that when the penalty parameter is large enough, the nonconvex function with even penalty norm will converge and converge to the global minimum. In addition, the PADMM algorithm does not impose special requirements on the order of variable updating, which greatly enhances the universality of the PADMM algorithm in dealing with various non-convex function optimization problems.

文章引用：宋政纲. 一类带偶次惩罚范数的非凸函数及周期ADMM算法的收敛性分析[J]. 应用数学进展, 2024, 13(6): 2641-2652. https://doi.org/10.12677/aam.2024.136252

参考文献

[1]	Jain, P. and Kar, P. (2017) Non-Convex Optimization for Machine Learning. Foundations and Trends® in Machine Learning, 10, 142-336. [Google Scholar] [CrossRef]
[2]	Du, S., Lee, J., Li, H., et al. (2019) Gradient Descent Finds Global Minima of Deep Neural Networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, 28 May 2019, 1675-1685.
[3]	Mignacco, F. and Urbani, P. (2022) The Effective Noise of Stochastic Gradient Descent. Journal of Statistical Mechanics: Theory and Experiment, 2022, Article 083405. [Google Scholar] [CrossRef]
[4]	Huang, F., Gao, S., Pei, J., et al. (2022) Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization. Journal of Machine Learning Research, 23, 1616-1685.
[5]	Shani, L., Efroni, Y. and Mannor, S. (2020) Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 5668-5675. [Google Scholar] [CrossRef]
[6]	Krutikov, V., Tovbis, E., Stanimirović, P. and Kazakovtsev, L. (2023) On the Convergence Rate of Quasi-Newton Methods on Strongly Convex Functions with Lipschitz Gradient. Mathematics, 11, Article 4715. [Google Scholar] [CrossRef]
[7]	Glowinski, R. and Marroco, A. (1975) Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-Dualité d’une classe de problèmes de Dirichlet non linéaires. Revue française d’automatique, informatique, recherche opérationnelle. Analyse Numérique, 9, 41-76. [Google Scholar] [CrossRef]
[8]	Gabay, D. and Mercier, B. (1976) A Dual Algorithm for the Solution of Nonlinear Variational Problems via Finite Element Approximation. Computers & Mathematics with Applications, 2, 17-40. [Google Scholar] [CrossRef]
[9]	Bertsekas, D.P. (2014) Constrained Optimization and Lagrange Multiplier Methods. Academic Press.
[10]	Jakovetic, D., Bajovic, D., Xavier, J. and Moura, J.M.F. (2020) Primal-Dual Methods for Large-Scale and Distributed Convex Optimization and Data Analytics. Proceedings of the IEEE, 108, 1923-1938. [Google Scholar] [CrossRef]
[11]	Ma, S. (2015) Alternating Proximal Gradient Method for Convex Minimization. Journal of Scientific Computing, 68, 546-572. [Google Scholar] [CrossRef]
[12]	Chen, C., He, B., Ye, Y. and Yuan, X. (2014) The Direct Extension of ADMM for Multi-Block Convex Minimization Problems Is Not Necessarily Convergent. Mathematical Programming, 155, 57-79. [Google Scholar] [CrossRef]
[13]	Lin, T., Ma, S. and Zhang, S. (2017) Global Convergence of Unmodified 3-Block ADMM for a Class of Convex Minimization Problems. Journal of Scientific Computing, 76, 69-88. [Google Scholar] [CrossRef]
[14]	Wang, Y., Yin, W. and Zeng, J. (2018) Global Convergence of ADMM in Nonconvex Nonsmooth Optimization. Journal of Scientific Computing, 78, 29-63. [Google Scholar] [CrossRef]
[15]	Chao, M.T., Zhang, Y. and Jian, J.B. (2020) An Inertial Proximal Alternating Direction Method of Multipliers for Nonconvex Optimization. International Journal of Computer Mathematics, 98, 1199-1217. [Google Scholar] [CrossRef]
[16]	Liavas, A.P. and Sidiropoulos, N.D. (2015) Parallel Algorithms for Constrained Tensor Factorization via Alternating Direction Method of Multipliers. IEEE Transactions on Signal Processing, 63, 5450-5463. [Google Scholar] [CrossRef]
[17]	Shen, Y., Wen, Z. and Zhang, Y. (2012) Augmented Lagrangian Alternating Direction Method for Matrix Separation Based on Low-Rank Factorization. Optimization Methods and Software, 29, 239-263. [Google Scholar] [CrossRef]
[18]	Mai, V. and Johansson, M. (2020) Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization. Proceedings of the 37th International Conference on Machine Learning, Online, 13-18 July 2020, 6630-6639.
[19]	Emmert-Streib, F. and Dehmer, M. (2019) High-Dimensional Lasso-Based Computational Regression Models: Regularization, Shrinkage, and Selection. Machine Learning and Knowledge Extraction, 1, 359-383. [Google Scholar] [CrossRef]
[20]	Avron, H., Clarkson, K.L. and Woodruff, D.P. (2017) Faster Kernel Ridge Regression Using Sketching and Preconditioning. SIAM Journal on Matrix Analysis and Applications, 38, 1116-1138. [Google Scholar] [CrossRef]
[21]	Zhong, W. and Kwok, J. (2014) Gradient Descent with Proximal Average for Nonconvex and Composite Regularization. Proceedings of the AAAI Conference on Artificial Intelligence, 28, 2206-2212. [Google Scholar] [CrossRef]
[22]	Li, X., Ding, S. and Li, Y. (2017) Outlier Suppression via Non-Convex Robust PCA for Efficient Localization in Wireless Sensor Networks. IEEE Sensors Journal, 17, 7053-7063. [Google Scholar] [CrossRef]

为你推荐

友情链接