Penalized Rank Regression for Varying-Coefficient Linear Models with Longitudinal Data
Abstract: Longitudinal data in varying-coefficient linear models are correlated within groups and independent between groups, and the within-group covariance matrix is typically estimated with low efficiency. To address this, the paper approximates the nonparametric components with B-spline functions and uses the modified Cholesky decomposition to construct a more efficient estimator of the within-group covariance matrix, which improves estimation efficiency for unbalanced longitudinal data. The method also incorporates two non-convex penalty functions, SCAD and MCP, to achieve robust variable selection. Simulation studies and an empirical analysis indicate that the proposed method yields more efficient estimates and that, under the same tuning-parameter criterion, the advantage of the MCP penalty is more pronounced for highly correlated sample data.
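For readers unfamiliar with the two penalties named in the abstract, the following is a minimal sketch of their standard scalar forms: SCAD as defined by Fan and Li (2001) and MCP as defined by Zhang (2010). It is not taken from the paper's implementation. The function names `scad_penalty` and `mcp_penalty`, and the default values `a=3.7` and `gamma=3.0`, are assumptions chosen for illustration.

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty at a scalar t, assuming lam > 0 and a > 2."""
    x = abs(t)
    if x <= lam:
        # Lasso-like linear part near zero
        return lam * x
    if x <= a * lam:
        # Quadratic transition that tapers the penalty
        return (2 * a * lam * x - x * x - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further
    return lam * lam * (a + 1) / 2


def mcp_penalty(t, lam, gamma=3.0):
    """MCP penalty at a scalar t, assuming lam > 0 and gamma > 1."""
    x = abs(t)
    if x <= gamma * lam:
        # Concave part: the penalization rate decreases linearly to zero
        return lam * x - x * x / (2 * gamma)
    # Constant beyond gamma * lam
    return gamma * lam * lam / 2


if __name__ == "__main__":
    for t in (0.5, 2.0, 5.0):
        print(t, scad_penalty(t, lam=1.0), mcp_penalty(t, lam=1.0))
```

Both penalties behave like the lasso near zero and become flat for large coefficients, which is why they penalize large effects less than a convex penalty would.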
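Cite this article: Zhang Yafei. Penalized Rank Regression for Varying-Coefficient Linear Models with Longitudinal Data [J]. Statistics and Application (统计学与应用), 2025, 14(7): 334-345. https://doi.org/10.12677/sa.2025.147209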
