Article
A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure
Statistical Methods in Medical Research (2018)
  • Laura B Balzer, University of Massachusetts Amherst
  • Wenjing Zheng, University of California, Berkeley
  • Mark J. van der Laan
  • Maya Petersen, University of California, Berkeley
Abstract
We often seek to estimate the impact of an exposure naturally occurring or randomly assigned at the cluster level. For example, the literature on neighborhood determinants of health continues to grow. Likewise, community randomized trials are applied to learn about real-world implementation, sustainability, and population effects of interventions with proven individual-level efficacy. In these settings, individual-level outcomes are correlated due to shared cluster-level factors, including the exposure, as well as social or biological interactions between individuals. To flexibly and efficiently estimate the effect of a cluster-level exposure, we present two targeted maximum likelihood estimators (TMLEs). The first TMLE is developed under a non-parametric causal model, which allows for arbitrary interactions between individuals within a cluster. These interactions include direct transmission of the outcome (i.e. contagion) and influence of one individual’s covariates on another’s outcome (i.e. covariate interference). The second TMLE is developed under a causal sub-model assuming the cluster-level and individual-specific covariates are sufficient to control for confounding. Simulations compare the alternative estimators and illustrate the potential gains from pairing individual-level risk factors and outcomes during estimation, while avoiding unwarranted assumptions. Our results suggest that estimation under the sub-model can result in bias and misleading inference in an observational setting. Incorporating working assumptions during estimation is more robust than assuming they hold in the underlying causal model. We illustrate our approach with an application to HIV prevention and treatment.
Keywords
  • Cluster-level exposures
  • cluster randomized trials
  • contagion
  • double robust
  • hierarchical
  • interference
  • multilevel
  • semi-parametric
  • Super Learner
  • targeted maximum likelihood estimation (TMLE)
Publication Date
Summer 2018
DOI
https://doi.org/10.1177/0962280218774936
Citation Information
Laura B Balzer, Wenjing Zheng, Mark J. van der Laan, and Maya Petersen. "A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure." Statistical Methods in Medical Research (2018).
Available at: http://works.bepress.com/laura_balzer/53/