Collaborative Double Robust Targeted Penalized Maximum Likelihood Estimation
Abstract
A new class of collaborative double robust targeted maximum likelihood estimators (C-DR-TMLE) targeting a particular parameter in a semiparametric model is proposed, building on the targeted maximum likelihood methodology of van der Laan and Rubin (2006). Targeted maximum likelihood estimation applies a targeted fluctuation function to a first-stage (overall) density estimator and estimates the amount of fluctuation by parametric maximum likelihood, treating the first-stage density estimator as an offset. The optimal targeted fluctuation function typically depends on an unknown nuisance parameter.
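As a minimal sketch of the fluctuation step just described, the following assumes a setting the abstract does not specify: a binary outcome Y, a binary treatment A, the average treatment effect as target parameter, and a logistic fluctuation whose covariate is built from the treatment mechanism (the nuisance parameter). All function and variable names are hypothetical, and the treatment mechanism is taken as known purely for illustration; this is standard (non-collaborative) targeting, not the collaborative procedure the paper proposes.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_epsilon(y, offset, h, n_iter=50, tol=1e-10):
    """MLE of eps in logit P(Y=1) = offset + eps * h, via Newton's method.

    The first-stage fit enters only through the fixed offset.
    """
    eps = 0.0
    for _ in range(n_iter):
        p = expit(offset + eps * h)
        score = np.sum(h * (y - p))            # derivative of the log-likelihood
        info = np.sum(h * h * p * (1.0 - p))   # negative second derivative
        step = score / info
        eps += step
        if abs(step) < tol:
            break
    return eps

def tmle_ate(y, a, q1, q0, g1):
    """One targeting step for the average treatment effect (assumed target).

    q1, q0: first-stage estimates of P(Y=1 | A=1, W) and P(Y=1 | A=0, W).
    g1:     estimate of the nuisance parameter P(A=1 | W).
    """
    # Fluctuation covariate; it depends on the nuisance parameter g1.
    h = a / g1 - (1 - a) / (1.0 - g1)
    q_obs = np.where(a == 1, q1, q0)
    eps = fit_epsilon(y, logit(q_obs), h)
    # Apply the fitted fluctuation to both treatment arms.
    q1_star = expit(logit(q1) + eps / g1)
    q0_star = expit(logit(q0) - eps / (1.0 - g1))
    return float(np.mean(q1_star - q0_star)), eps, h, q1_star, q0_star

# Simulated data (illustrative only).
rng = np.random.default_rng(0)
n = 2000
w = rng.normal(size=n)
g1 = expit(0.5 * w)                       # treatment mechanism, assumed known here
a = rng.binomial(1, g1)
y = rng.binomial(1, expit(-0.5 + a + w))

# Deliberately crude first-stage fit: ignores both A and W.
q_init = np.full(n, y.mean())
psi, eps, h, q1_star, q0_star = tmle_ate(y, a, q_init, q_init, g1)
print(psi, eps)
```

After the update, the fluctuated fit solves the score equation for the fluctuation parameter, that is, the sum of `h * (y - q_star)` over the observed treatment arms is zero up to numerical tolerance. This is the sense in which the targeting step removes bias for the target parameter even when the first-stage fit is poor.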
This article advances the methodology further by generating a sequence of targeted maximum likelihood estimators with increasing likelihood, indexed by increasingly nonparametric nuisance parameter estimators. Likelihood-based cross-validation is used to select the nuisance parameter estimator for which the targeted maximum likelihood step yields the most effective bias reduction with respect to the target parameter.
A newly introduced collaborative double robustness of the efficient score equations solved by these targeted maximum likelihood estimators is shown to be superior to the current definition of double robustness in the estimating equation literature (e.g., Robins and Rotnitzky (2001), Robins et al. (2000), Robins (200a), van der Laan and Robins (2003)), both in theory and in practice. As a consequence of this collaborative double robustness, and with maximum likelihood as the principal driving force, the resulting C-DR-TMLE is a more robust and optimal estimator of any pathwise differentiable parameter in any semiparametric model than the current state of the art in double robust estimation.
In addition, a general strategy of penalizing the log-likelihood is introduced, so that the selection among candidate targeted maximum likelihood estimators becomes more targeted towards the parameter of interest. This penalization avoids breakdowns of the estimation procedure for borderline identifiable target parameters, and it results in a class of collaborative double robust targeted penalized maximum likelihood estimators (C-DR-TPMLE).
The method is illustrated in the context of estimation of causal effects in marginal structural models. In addition, simulations for nonparametric causal effect estimation illustrate the gain in practical performance of the collaborative double robust targeted maximum likelihood machine learning algorithms relative to current competitors, such as the double robust estimating equation methodology that relies on an external, non-collaborative estimator of the nuisance parameter. Comparisons are also provided with popular ad hoc estimation procedures such as propensity score matching and inverse probability of treatment weighting. Finally, a particular C-DR-TPMLE implementation is applied to assess the effects of mutations in HIV on drug resistance.
This research provides a template for targeted, efficient, and robust machine learning of a particular target feature of the probability distribution of the data within large (infinite-dimensional) semiparametric models, while still providing statistical inference in terms of confidence intervals and p-values.
Suggested Citation
Mark J. van der Laan and Susan Gruber. 2009. "Collaborative Double Robust Targeted Penalized Maximum Likelihood Estimation" U.C. Berkeley Division of Biostatistics Working Paper Series
Available at: http://works.bepress.com/sgruber/2