Targeted Maximum Likelihood Estimation for Causal Effects Based on Poisson Regression Models
Abstract: In observational studies, conventional regression estimates of the average treatment effect (ATE) of a binary treatment are vulnerable to model misspecification. Targeted maximum likelihood estimation (TMLE) is a semiparametric, doubly robust method: it yields a consistent estimate as long as either the outcome model or the propensity score model is correctly specified. Focusing on Poisson count outcomes, this paper systematically introduces the principles and algorithm of TMLE and uses Monte Carlo simulations to compare TMLE with other methods under three scenarios: both models correctly specified, a misspecified propensity score model, and a misspecified outcome model. In the simulations, TMLE maintains low bias and a small root mean squared error in all three scenarios, which is consistent with its double robustness. An empirical example further illustrates its practical utility on real count data. TMLE is a reliable method for estimating the ATE with Poisson-type outcomes and is recommended as one of the preferred tools for causal inference with count outcomes in observational studies.
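To make the estimator described in the abstract concrete, the following is a minimal sketch of one common way to construct a TMLE for the ATE with a count outcome. It is not the authors' implementation. It assumes a Poisson log-link initial outcome model, a logistic propensity score model, and a Poisson log-link fluctuation step with the clever covariate H = A/g - (1 - A)/(1 - g). The function names `fit_glm` and `tmle_poisson_ate`, the truncation bounds on the propensity score, and the simulated data-generating process are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_glm(X, y, family, offset=None, n_iter=50):
    """Newton-Raphson fit of a canonical-link GLM ('binomial' or 'poisson')."""
    beta = np.zeros(X.shape[1])
    off = np.zeros(len(y)) if offset is None else offset
    for _ in range(n_iter):
        eta = np.clip(X @ beta + off, -30.0, 30.0)  # guard against overflow
        if family == "binomial":
            mu = 1.0 / (1.0 + np.exp(-eta))
            w = mu * (1.0 - mu)
        else:  # poisson with log link
            mu = np.exp(eta)
            w = mu
        grad = X.T @ (y - mu)
        hess = X.T @ (X * w[:, None])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


def tmle_poisson_ate(W, A, Y, g_bounds=(0.025, 0.975)):
    """Sketch of TMLE for the ATE with a count outcome (assumed setup)."""
    n = len(Y)
    ones = np.ones(n)

    # Step 1: initial outcome model Q(A, W) = E[Y | A, W], Poisson log link.
    XQ = np.column_stack([ones, A, W])
    bQ = fit_glm(XQ, Y, "poisson")
    Q_A = np.exp(XQ @ bQ)
    Q_1 = np.exp(np.column_stack([ones, ones, W]) @ bQ)
    Q_0 = np.exp(np.column_stack([ones, np.zeros(n), W]) @ bQ)

    # Step 2: propensity score g(W) = P(A = 1 | W), truncated for stability.
    Xg = np.column_stack([ones, W])
    bg = fit_glm(Xg, A, "binomial")
    g = np.clip(1.0 / (1.0 + np.exp(-(Xg @ bg))), *g_bounds)

    # Step 3: fluctuation. Regress Y on the clever covariate H with
    # offset log Q(A, W); the Poisson score then solves sum H (Y - Q*) = 0.
    H_A = A / g - (1 - A) / (1 - g)
    eps = fit_glm(H_A[:, None], Y, "poisson", offset=np.log(Q_A))[0]

    # Step 4: updated predictions and plug-in estimate.
    Q_A_star = Q_A * np.exp(eps * H_A)
    Q_1_star = Q_1 * np.exp(eps / g)
    Q_0_star = Q_0 * np.exp(-eps / (1 - g))
    psi = np.mean(Q_1_star - Q_0_star)

    # Step 5: influence-curve-based standard error.
    eif = H_A * (Y - Q_A_star) + Q_1_star - Q_0_star - psi
    se = np.std(eif, ddof=1) / np.sqrt(n)
    return psi, se, eif


# Hypothetical simulated data: two confounders, binary treatment, Poisson outcome.
rng = np.random.default_rng(0)
n = 5000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 * W[:, 0] - 0.5 * W[:, 1]))))
Y = rng.poisson(np.exp(0.5 + 0.4 * A + 0.3 * W[:, 0] + 0.2 * W[:, 1]))

psi, se, eif = tmle_poisson_ate(W, A, Y)
print(f"TMLE ATE estimate: {psi:.3f} (SE {se:.3f})")
```

Under this simulated design the true ATE has the closed form exp(0.5) · (exp(0.4) − 1) · exp(0.065), roughly 0.865, so the printed estimate can be compared against it. Replacing either working model with a deliberately misspecified one (for example, dropping a confounder from only one of them) is a simple way to probe the double robustness property discussed above.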