Robust Variable Selection Based on Geman-McClure Loss
DOI: 10.12677/SA.2023.122042. Supported by research project funding.
Authors: Xie Chuanqi (谢传琪), Wang Yanxin* (王延新): School of Science, Ningbo University of Technology, Ningbo, Zhejiang
Keywords: Robust Variable Selection, High-Dimensional Data, Adaptive LASSO, Geman-McClure Loss
Abstract: Penalized least squares and penalized likelihood estimation are effective methods for variable selection. However, when the data contain outliers, the robustness of these estimators is severely challenged. This paper proposes a robust variable selection method based on the Geman-McClure loss, a loss function that effectively resists the influence of outliers in the data. Numerical simulations and a real-data analysis verify the effectiveness and robustness of the proposed model.
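To illustrate why a Geman-McClure type loss resists outliers, the following minimal sketch compares it with the squared loss on a few residuals. The abstract does not state the exact parameterization used in the paper, so the form rho(r) = r^2 / (r^2 + c^2) and the scale parameter `c` are assumptions chosen for illustration, as are the function names.

```python
import numpy as np

def geman_mcclure_loss(r, c=1.0):
    """Geman-McClure type loss, rho(r) = r^2 / (r^2 + c^2).

    Assumed parameterization (not taken from the paper); c is a
    hypothetical scale parameter. The loss is bounded above by 1.
    """
    r = np.asarray(r, dtype=float)
    return r**2 / (r**2 + c**2)

def squared_loss(r):
    """Ordinary squared loss, which grows without bound in |r|."""
    r = np.asarray(r, dtype=float)
    return r**2

# Residuals ranging from a perfect fit to a gross outlier.
residuals = np.array([0.0, 0.5, 1.0, 10.0, 100.0])
gm = geman_mcclure_loss(residuals)
sq = squared_loss(residuals)

for r, g, s in zip(residuals, gm, sq):
    print(f"residual={r:7.1f}  geman_mcclure={g:.4f}  squared={s:.1f}")
```

Because the assumed loss saturates below 1, a single gross outlier contributes at most a bounded amount to the objective, whereas under the squared loss its contribution grows quadratically with the residual.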
Citation: Xie Chuanqi, Wang Yanxin. Robust Variable Selection Based on Geman-McClure Loss [J]. Statistics and Application, 2023, 12(2): 391-397. https://doi.org/10.12677/SA.2023.122042
