Multiple Change-Point Estimation in Mean Models Based on Robust Variable Selection
DOI: 10.12677/pm.2026.165145. Supported by research project funding.
Authors: Ge Mingxia (葛明霞), Dong Cuiling (董翠玲)*: School of Mathematical Sciences, Xinjiang Normal University, Urumqi, Xinjiang
Keywords: Variable Selection, Exponential Squared Loss, Mean Change-Point Model, Outliers, Robust Estimation
Abstract: Multiple change-point estimation based on variable selection is an active research topic. Outliers, however, can severely degrade the accuracy and stability of change-point estimates, which makes multiple change-point estimation based on robust variable selection a pressing problem. This paper proposes a two-stage multiple change-point estimation method built on the Exponential Squared Loss (ESL) function. Tuning parameters give flexible control over the penalty strength, so the method resists outliers effectively and shows good robustness. Numerical simulations show that, in scenarios with 1%, 5%, and 10% outliers, the ESL-based two-stage method outperforms both the cumulative sum (CUSUM) method and the two-stage multiple change-point estimation method based on the squared loss function. An empirical analysis of the well-log data shipped with R verifies the effectiveness of the method. The method offers a new approach to robust change-point estimation for data containing outliers and can serve as methodological support and reference for related practical applications.
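To illustrate why an exponential squared loss resists outliers, the sketch below assumes the commonly used form of the loss, 1 - exp(-r^2 / gamma), and estimates the mean of a single segment by fixed-point iteration. It is not the paper's two-stage procedure: the names `esl_loss`, `esl_mean`, and `gamma`, the toy data, and the iteration scheme are all illustrative assumptions.

```python
import math
from statistics import median


def esl_loss(residual, gamma=1.0):
    """Assumed ESL form: 1 - exp(-r^2 / gamma); never exceeds 1."""
    return 1.0 - math.exp(-residual ** 2 / gamma)


def esl_mean(values, gamma=1.0, n_iter=100, tol=1e-10):
    """Hypothetical helper: stationary point of the summed ESL for a mean.

    Setting the derivative of sum(esl_loss(v - mu)) to zero gives a
    weighted mean with weights exp(-(v - mu)^2 / gamma), iterated here.
    """
    mu = median(values)  # robust starting point
    for _ in range(n_iter):
        weights = [math.exp(-(v - mu) ** 2 / gamma) for v in values]
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Toy segment (made-up data): six points near 0 and one gross outlier.
segment = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 50.0]
sample_mean = sum(segment) / len(segment)  # pulled toward the outlier
robust_mean = esl_mean(segment)            # outlier weight is nearly zero
```

Because each observation contributes at most 1 to the ESL objective, a single extreme value cannot dominate the fit, whereas its squared-loss contribution grows without bound. A smaller `gamma` downweights large residuals more aggressively. The iteration finds a stationary point near the starting value and is not guaranteed to reach a global minimum.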
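Cite this article: Ge, M. and Dong, C. (2026) Multiple Change-Point Estimation in Mean Models Based on Robust Variable Selection. Pure Mathematics (理论数学), 16(5), 215-226. https://doi.org/10.12677/pm.2026.165145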
