Robust Estimation of Heterogeneous Treatment Effects using Electronic Health Record Data
Estimation of heterogeneous treatment effects is an essential component of precision medicine. Model and algorithm-based methods have been developed within the causal inference framework to achieve valid estimation and inference. Existing methods such as the A-learner, R-learner, modified covariates method (with and without efficiency augmentation), inverse propensity score weighting, and augmented inverse propensity score weighting have been proposed mostly under the square error loss function. The performance of these methods in the presence of data irregularity and high dimensionality, such as that encountered in electronic health record (EHR) data analysis, has been less studied. In this research, we describe a general formulation that unifies many of the existing learners through a common score function. The new formulation allows the incorporation of least absolute deviation (LAD) regression and dimension reduction techniques to counter the challenges in EHR data analysis. We show that under a set of mild regularity conditions, the resultant estimator has an asymptotic normal distribution. Within this framework, we proposed two specific estimators for EHR analysis based on weighted LAD with penalties for sparsity and smoothness simultaneously. Our simulation studies show that the proposed methods are more robust to outliers under various circumstances. We use these methods to assess the blood pressure-lowering effects of two commonly used antihypertensive therapies.
READ FULL TEXT