Parameter Estimation and Outlier Detection Based on Nonconvex Penalized Regression
Abstract: Based on optimization theory, this paper proposes a method for parameter estimation and outlier detection in multiple linear regression models. First, a regression model based on the Huber loss function and the l0 penalty is developed; to facilitate its solution, the l0 penalty in this model is further relaxed to the Capped-l1 penalty. Second, the directional stationary points of the relaxed problem are characterized, and the equivalence between the original problem and the relaxed problem is established. Finally, a smoothing model for the relaxed problem is proposed, and the consistency between the stationary points of the smoothing model and those of the relaxed problem is proved.
文章引用:张尊皓, 彭定涛, 苏妍妍. 基于非凸惩罚回归的参数估计和异常值检测[J]. 应用数学进展, 2022, 11(12): 9081-9095. https://doi.org/10.12677/AAM.2022.1112958
