Propensity score analysis with latent covariates: Measurement error bias correction using the covariate's posterior mean, aka the inclusive factor score
We address the measurement error bias that arises in propensity score (PS) analysis when covariates are latent variables. In the setting where a latent covariate X is measured via multiple error-prone items W, PS analysis using any of several proxies for X -- the W items themselves, a summary score (mean/sum of the items), or the conventional factor score (cFS, i.e., the predicted value of X based on the measurement model) -- often results in biased estimation of the causal effect, because balancing the proxy (between exposure conditions) does not balance X. We propose an improved proxy: the conditional mean of X given the combination of W, the observed covariates Z, and exposure A, denoted X_WZA. The theoretical justification, which holds for any unobserved X whether or not it is latent, is that balancing X_WZA (e.g., via weighting or matching) implies balancing the mean of X. For a latent X, we estimate X_WZA by the inclusive factor score (iFS) -- the predicted value of X from a structural equation model that captures the joint distribution of (X,W,A) given Z. Simulation shows that PS analysis using the iFS substantially improves balance on the first five moments of X and reduces bias in the estimated causal effect. Hence, within the proxy variables approach, we recommend this proxy over existing ones. We connect this proxy method to known results about weighting/matching functions (Lockwood & McCaffrey, 2016; McCaffrey, Lockwood, & Setodji, 2013). We illustrate the method in handling latent covariates when estimating the effect of out-of-school suspension on risk of later police arrests using Add Health data.
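The balancing claim can be checked numerically with a small sketch. The code below is not the authors' simulation or their SEM-based iFS estimator: it assumes a hypothetical normal-normal model with no observed covariates Z, in which the posterior mean E[X | W, A] has a closed form computed from the known (assumed) parameters. The function `tilt_controls` and all parameter values are illustrative. Controls are reweighted by exponential tilting so that their weighted proxy mean matches the treated proxy mean, and the remaining gap in the mean of the true X is then compared for two proxies: the summary score and the posterior mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma = 200_000, 4, 1.0  # hypothetical: sample size, number of items, item error SD

# Hypothetical data-generating model (no Z, for brevity):
# A ~ Bernoulli(0.5), X | A ~ N(0.5 * A, 1), W_j = X + N(0, sigma^2) error.
A = rng.integers(0, 2, size=n)
prior_mean = 0.5 * A
X = prior_mean + rng.normal(size=n)
W = X[:, None] + sigma * rng.normal(size=(n, k))

# Proxy 1: summary score (item mean).
summary = W.mean(axis=1)

# Proxy 2: posterior mean E[X | W, A], available in closed form for this
# normal-normal model (prior precision 1, data precision k / sigma^2).
post_mean = (prior_mean + W.sum(axis=1) / sigma**2) / (1.0 + k / sigma**2)


def tilt_controls(proxy, treated, n_iter=25):
    """Weights exp(lam * proxy) on controls so that their weighted proxy mean
    matches the treated proxy mean; lam is found by Newton's method."""
    p0, target = proxy[~treated], proxy[treated].mean()
    lam = 0.0
    for _ in range(n_iter):
        w = np.exp(lam * (p0 - p0.mean()))  # centering only rescales the weights
        w /= w.sum()
        m = np.sum(w * p0)
        v = np.sum(w * (p0 - m) ** 2)  # derivative of the weighted mean in lam
        lam += (target - m) / v
    w = np.exp(lam * (p0 - p0.mean()))
    return w / w.sum()


treated = A == 1
x_treated = X[treated].mean()

w_summary = tilt_controls(summary, treated)
w_post = tilt_controls(post_mean, treated)

# Remaining imbalance in the mean of the true X after balancing each proxy.
gap_summary = x_treated - np.sum(w_summary * X[~treated])
gap_post = x_treated - np.sum(w_post * X[~treated])
print(f"gap in mean of X, summary-score weights:  {gap_summary:.3f}")
print(f"gap in mean of X, posterior-mean weights: {gap_post:.3f}")
```

Under these assumed parameters, balancing the summary score should leave a gap of roughly 0.1 in the mean of X (out of an unweighted difference of 0.5), while balancing the posterior mean should leave a gap near zero up to sampling noise. The reason is the tower property: the weights depend only on (W, A), so the weighted mean of X within an exposure group equals the weighted mean of E[X | W, A] in that group.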