Estimation of a Grouped Multi-Kink Regression Model Based on Panel Data
Abstract: A kink regression model describes a continuous piecewise-linear relationship between the response variable and a covariate. Using panel data, this paper studies a multi-kink regression model in which individuals share a latent group structure. First, a coordinate descent method based on a greedy strategy is developed to pre-estimate the kink locations; it addresses the sensitivity of kink estimators to initial values at low computational cost, and an information criterion is used to select the number of kinks. Then, within the framework of this pre-estimation algorithm, the max-min distance method selects the initial cluster centers for a K-means-type algorithm that optimizes the model parameters of each group, and the number of groups is determined by an automated elbow method. Numerical simulations and an empirical analysis show that the method yields good estimates of both the parameters and the group structure, and that the estimated group structure and within-group parameters are practically meaningful in a real data set of female progesterone levels.
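To make the idea of coordinate-wise kink estimation concrete, the following is a minimal sketch for a single series, not the paper's algorithm, whose details the abstract does not give. It assumes the common hinge parameterization y = a0 + a1·x + Σ_k b_k·(x − δ_k)_+ + e, fits the linear coefficients by least squares for fixed kinks, and updates one kink at a time by grid search while holding the others fixed. The function names (`design_matrix`, `sse`, `coordinate_descent_kinks`), the quantile grid, and the starting values are all illustrative assumptions.

```python
import numpy as np

def design_matrix(x, kinks):
    """Columns: intercept, x, and one hinge term (x - kink)_+ per kink."""
    hinges = [np.maximum(x - k, 0.0) for k in kinks]
    return np.column_stack([np.ones_like(x), x, *hinges])

def sse(x, y, kinks):
    """Residual sum of squares of the least-squares fit for fixed kinks."""
    X = design_matrix(x, kinks)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def coordinate_descent_kinks(x, y, n_kinks, grid_size=50, max_sweeps=20):
    """Update one kink at a time by grid search, holding the others fixed."""
    # Candidate locations: interior quantiles of x (an assumed choice).
    grid = np.quantile(x, np.linspace(0.05, 0.95, grid_size))
    # Start from equally spaced quantiles (an arbitrary starting point).
    kinks = list(np.quantile(x, np.linspace(0, 1, n_kinks + 2)[1:-1]))
    best = sse(x, y, kinks)
    for _ in range(max_sweeps):
        improved = False
        for j in range(n_kinks):
            for cand in grid:
                trial = kinks.copy()
                trial[j] = cand
                if len(set(trial)) < n_kinks:
                    continue  # skip coincident kinks
                val = sse(x, y, trial)
                if val < best - 1e-12:
                    best, kinks, improved = val, trial, True
        if not improved:
            break  # a full sweep changed nothing: stop
    return sorted(kinks), best

# Simulated example with true kinks at 3 and 7 (hypothetical data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 401)
y = (1.0 + 0.5 * x + 2.0 * np.maximum(x - 3.0, 0.0)
     - 3.0 * np.maximum(x - 7.0, 0.0) + rng.normal(0.0, 0.1, x.size))
est_kinks, est_sse = coordinate_descent_kinks(x, y, n_kinks=2)
```

Each coordinate update costs one pass over the grid, so a sweep over K kinks needs roughly K times `grid_size` least-squares fits, compared with `grid_size` to the power K for an exhaustive joint grid search. In a grouped panel setting, a routine of this kind would presumably be applied within each candidate group; that step is not shown here.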
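Cite this article: Wang Hao (王昊). Estimation of Grouped Multi-Kink Regression Model Based on Panel Data [J]. Advances in Applied Mathematics (应用数学进展), 2024, 13(8): 4021-4033. https://doi.org/10.12677/aam.2024.138383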
