Generalizing a causal effect: sensitivity analysis and missing covariates
While a randomized controlled trial (RCT) readily measures the average treatment effect (ATE), this measure may need to be shifted to generalize to a different target population. Standard estimators of the target population treatment effect account for the distributional shift in covariates, using inverse propensity sampling weighting (IPSW) or modeling the response with the g-formula. However, these estimators require covariates that are available in both the RCT and an observational sample, a condition that often only a few covariates satisfy. Here we analyze how the classic estimators behave when covariates are missing in at least one of the two datasets, RCT or observational. In line with general identifiability conditions, these estimators are consistent when including only treatment effect modifiers that are shifted in the target population. We compute the expected bias induced by a missing covariate, assuming Gaussian covariates and a linear model for the conditional ATE function. This enables sensitivity analysis for each missing covariate pattern, and it also gives the sign of the expected bias. We also show that there is no gain in imputing a partially unobserved covariate. Finally, we study the replacement of a missing covariate by a proxy, and the impact of imputation. We illustrate all these results on simulations, on semi-synthetic benchmarks using data from the Tennessee Student/Teacher Achievement Ratio (STAR), and on a real-world example from the critical care medical domain.
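To make the setting concrete, here is a minimal simulation sketch of the g-formula under a linear conditional ATE, with and without one shifted treatment effect modifier. It is an illustration only, not the authors' code: the function names (`simulate_rct`, `g_formula_ate`), the coefficients `delta`, the covariate means, and the sample sizes are all hypothetical, and the covariates are assumed to be independent unit-variance Gaussians. In this simplified case, omitting covariate j shifts the estimate by minus `delta[j]` times the mean shift of that covariate between the RCT and the target population.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two independent Gaussian covariates, both effect modifiers.
delta = np.array([2.0, 3.0])        # assumed linear CATE: tau(x) = delta @ x
mean_rct = np.array([0.0, 0.0])     # covariate means in the RCT
mean_target = np.array([1.0, 0.5])  # shifted covariate means in the target population


def simulate_rct(n, mean, delta, rng):
    """Draw covariates, a randomized treatment, and a linear outcome."""
    X = rng.normal(loc=mean, scale=1.0, size=(n, len(mean)))
    A = rng.integers(0, 2, size=n)  # fair-coin treatment assignment
    Y = X.sum(axis=1) + A * (X @ delta) + rng.normal(size=n)
    return X, A, Y


def g_formula_ate(X_rct, A, Y, X_target, cols):
    """Fit one linear outcome model per arm on the RCT, using only the
    covariates in `cols`, then average the predicted difference over the
    target sample."""
    def fit(X, y):
        design = np.column_stack([np.ones(len(X)), X])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        return beta

    Xr, Xt = X_rct[:, cols], X_target[:, cols]
    beta_treated = fit(Xr[A == 1], Y[A == 1])
    beta_control = fit(Xr[A == 0], Y[A == 0])
    design_target = np.column_stack([np.ones(len(Xt)), Xt])
    return float(np.mean(design_target @ (beta_treated - beta_control)))


n = 20_000
X_rct, A, Y = simulate_rct(n, mean_rct, delta, rng)
# The target (observational) sample only needs covariates here.
X_target = rng.normal(loc=mean_target, scale=1.0, size=(n, 2))

tau_true = float(delta @ mean_target)                      # target ATE under the assumed model
tau_full = g_formula_ate(X_rct, A, Y, X_target, [0, 1])    # all covariates observed
tau_missing = g_formula_ate(X_rct, A, Y, X_target, [0])    # covariate 1 missing

# Bias expected in this independent-covariate setting (sign included).
expected_bias = -delta[1] * (mean_target[1] - mean_rct[1])

print(f"true target ATE:      {tau_true:.2f}")
print(f"g-formula, full:      {tau_full:.2f}")
print(f"g-formula, X1 absent: {tau_missing:.2f}")
print(f"expected bias:        {expected_bias:.2f}")
```

Varying `delta[1]` or the mean shift of the omitted covariate over a plausible range gives a crude sensitivity analysis in the spirit described above. With correlated covariates the bias would also depend on the covariance structure, which this sketch deliberately ignores.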