Research and Application of a Combined Penalized Likelihood Estimation Method Based on the Two-Part Model
Abstract: In statistics, zero-inflated models are commonly used to study the underlying model structure of zero-inflated data and the associated variable selection problem. In most cases, however, the non-zero part of the response variable is quantitative, so a simple zero-inflated model cannot describe the structure of such data and the corresponding parameter estimation methods no longer apply. To address this, researchers have proposed the two-part model for zero-inflated semi-continuous data. This paper introduces combined penalized likelihood estimation into the two-part model and studies its variable selection problem. A new penalized likelihood estimation method for high-dimensional statistical analysis, NCPM (New Combined Punishment Method), is proposed and applied to precipitation data from Taiyuan to analyze the factors that influence precipitation. Both the simulation study and the real-data analysis indicate that the proposed method is effective and achieves higher prediction accuracy than traditional penalized likelihood estimation methods.
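Cite this article: Zhang Xuyu, Zhao Lihua. Research and Application of a Combined Penalized Likelihood Estimation Method Based on the Two-Part Model [J]. Advances in Applied Mathematics (应用数学进展), 2020, 9(6): 881-891. https://doi.org/10.12677/AAM.2020.96105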
