Research on the Generalized Functional Linear Model with Interaction Terms and an Unknown Link Function
Abstract: This paper proposes a generalized partial functional linear model. The model accommodates both scalar and functional predictors, leaves the link function unspecified, and includes interaction effects among the functional predictors, so that it better reflects the complex structure of real data. Functional principal component analysis is first used to reduce the dimension of the high-dimensional functional data. For parameter estimation, we propose an iterative algorithm: with the link function treated as known, the regression coefficients are estimated by maximum likelihood; then, with the regression coefficients held fixed, the link function is estimated by local linear kernel regression. The two steps alternate until convergence, which yields the final estimates of both the regression coefficients and the link function. To assess the model, we apply it to classifying meat samples as high or low in protein content. The empirical results show that the model captures the complex relationship between the predictors and the response variable. Its five-fold cross-validation accuracy is 97.03%, compared with 94.35% for the model with a known link function and interaction terms and 90.45% for the model with a known link function but without interaction terms.
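The alternating scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a binary response, two functional predictors observed on a common grid, FPCA computed by an SVD of the centered curves, interactions formed as products of FPC scores, a Gaussian kernel with a fixed bandwidth, and a single damped Fisher scoring step standing in for the full maximum likelihood update. The function names (`fpca_scores`, `local_linear`, `fit_alternating`), the simulated data, and all tuning values are hypothetical choices made for illustration.

```python
import numpy as np

def fpca_scores(curves, n_comp):
    """FPCA for curves on a common grid: SVD of the centered data matrix."""
    centered = curves - curves.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_comp].T          # leading principal component scores

def local_linear(u, y, h, ridge=1e-8):
    """Local linear kernel regression of y on u, evaluated at the sample
    points. Returns fitted values g(u_j) and fitted slopes g'(u_j)."""
    d = u[None, :] - u[:, None]              # d[j, i] = u_i - u_j
    w = np.exp(-0.5 * (d / h) ** 2)          # Gaussian kernel weights
    s0 = w.sum(1) + ridge                    # tiny ridge keeps the 2x2 system solvable
    s1 = (w * d).sum(1)
    s2 = (w * d**2).sum(1) + ridge
    t0, t1 = w @ y, (w * d) @ y
    det = s0 * s2 - s1**2
    return (s2 * t0 - s1 * t1) / det, (s0 * t1 - s1 * t0) / det

def fit_alternating(X, y, h=0.4, n_iter=30, tol=1e-6, ridge=1.0, eps=0.01):
    """Alternate between a link step (coefficients fixed) and a coefficient
    step (link fixed). beta is kept at unit norm, since its scale and the
    intercept are absorbed by the unknown link."""
    beta = np.linalg.lstsq(X, y - y.mean(), rcond=None)[0]   # crude start
    beta /= np.linalg.norm(beta)
    for _ in range(n_iter):
        # Link step: smooth y against the current index X @ beta.
        g, dg = local_linear(X @ beta, y, h)
        g = np.clip(g, eps, 1 - eps)
        var = g * (1 - g)
        # Coefficient step: one damped Fisher scoring step on the Bernoulli
        # log-likelihood, using the current link and its slope.
        score = X.T @ (dg * (y - g) / var)
        info = (X.T * (dg**2 / var)) @ X + ridge * np.eye(X.shape[1])
        new = beta + np.linalg.solve(info, score)
        new /= np.linalg.norm(new)
        converged = np.linalg.norm(new - beta) < tol
        beta = new
        if converged:
            break
    g, _ = local_linear(X @ beta, y, h)
    return beta, np.clip(g, eps, 1 - eps)

# Simulated data (hypothetical): two functional predictors and one scalar.
rng = np.random.default_rng(0)
n, t = 200, np.linspace(0, 1, 50)
basis = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
a = rng.normal(size=(n, 2)) * [2.0, 1.0]
b = rng.normal(size=(n, 2)) * [2.0, 1.0]
curves1 = a @ basis + 0.1 * rng.normal(size=(n, t.size))
curves2 = b @ basis + 0.1 * rng.normal(size=(n, t.size))
z = rng.normal(size=n)
index = z + 0.5 * a[:, 0] - 0.5 * b[:, 0] + 0.3 * a[:, 0] * b[:, 0]
prob = 1 - np.exp(-np.exp(index))            # true link, treated as unknown by the fit
y = (rng.uniform(size=n) < prob).astype(float)

# Design matrix: scalar covariate, FPC scores, and cross-predictor interactions.
s1, s2 = fpca_scores(curves1, 2), fpca_scores(curves2, 2)
inter = (s1[:, :, None] * s2[:, None, :]).reshape(n, -1)
X = np.column_stack([z, s1, s2, inter])
X = (X - X.mean(0)) / X.std(0)

beta_hat, p_hat = fit_alternating(X, y)
accuracy = np.mean((p_hat > 0.5) == (y == 1))
print("unit-norm coefficients:", np.round(beta_hat, 3))
print("in-sample accuracy:", round(float(accuracy), 3))
```

The accuracy printed here is in-sample and comes from simulated data, so it is not comparable to the cross-validated figures reported above. A fuller treatment would choose the bandwidth and the number of components by cross-validation and would iterate the coefficient step to convergence for each link estimate.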