Learning and Testing Sub-groups with Heterogeneous Treatment Effects: A Sequence of Two Studies
There is strong interest in estimating how the magnitude of an intervention's treatment effect varies across sub-groups of the population of interest. In this paper, we present a two-study approach that first proposes and then tests heterogeneous treatment effects. In Study 1, we use a large observational dataset to learn the sub-groups with the most distinctive treatment-outcome relationships ('high/low-impact sub-groups'). We adopt a model-based recursive partitioning approach to propose the high/low-impact sub-groups, and validate them using sample splitting. Although sample splitting in the first study guards against sub-groups that reflect only noise, the estimated heterogeneous treatment effects may still be biased because the data are observational. Study 2 uses an experimental design, in which we classify sample units according to the sub-groups learned in Study 1. We then estimate treatment effects within each group, thereby testing the causal hypotheses proposed in Study 1. Using patient claims data from the NBER MarketScan database, we apply our approach to estimate the heterogeneous effects of a switch to a high-deductible health insurance plan on the use of outpatient care by patients with a common chronic condition. We extend the method to learn the sub-groups in Study 1 non-parametrically. We also compare the method's performance with that of other state-of-the-art methods in the literature that use only the Study 2 data.
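The two-study workflow described above can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data are synthetic (not MarketScan), the function names are hypothetical, and a one-level threshold search on a single covariate stands in for model-based recursive partitioning, which the paper applies in a richer form. The sketch learns a split on one half of the observational sample, checks it on the held-out half, and then re-estimates the sub-group effects on a separate randomized sample.

```python
import random
from statistics import mean

def effect(rows):
    """Difference in mean outcome (treated minus control); rows are (x, t, y)."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    if not treated or not control:
        return None
    return mean(treated) - mean(control)

def subgroup_effects(rows, c):
    """Estimated effects in the low (x <= c) and high (x > c) sub-groups."""
    low = [r for r in rows if r[0] <= c]
    high = [r for r in rows if r[0] > c]
    return effect(low), effect(high)

def learn_split(rows, candidates, min_size=30):
    """Pick the threshold whose two sub-groups differ most in estimated effect.
    Simplified, one-level stand-in for recursive partitioning (an assumption)."""
    best_c, best_gap = None, -1.0
    for c in candidates:
        n_low = sum(1 for r in rows if r[0] <= c)
        if n_low < min_size or len(rows) - n_low < min_size:
            continue
        e_low, e_high = subgroup_effects(rows, c)
        if e_low is None or e_high is None:
            continue
        gap = abs(e_high - e_low)
        if gap > best_gap:
            best_c, best_gap = c, gap
    return best_c

def simulate(n, rng, confounded):
    """Synthetic data: the true effect is 2 when x > 0.5 and 0 otherwise.
    If confounded, treatment uptake depends on x, which also shifts y."""
    rows = []
    for _ in range(n):
        x = rng.random()
        p_treat = 0.2 + 0.6 * x if confounded else 0.5
        t = 1 if rng.random() < p_treat else 0
        tau = 2.0 if x > 0.5 else 0.0
        y = x + tau * t + rng.gauss(0, 1)
        rows.append((x, t, y))
    return rows

rng = random.Random(0)
candidates = [i / 10 for i in range(1, 10)]

# Study 1: observational data, split into learning and validation halves.
study1 = simulate(4000, rng, confounded=True)
train, valid = study1[:2000], study1[2000:]
c = learn_split(train, candidates)
valid_low, valid_high = subgroup_effects(valid, c)

# Study 2: randomized data, classified with the threshold learned in Study 1.
study2 = simulate(2000, rng, confounded=False)
exp_low, exp_high = subgroup_effects(study2, c)

print("learned threshold:", c)
print("Study 1 validation effects (low, high):", valid_low, valid_high)
print("Study 2 experimental effects (low, high):", exp_low, exp_high)
```

In this toy setup the Study 1 validation estimates can absorb some confounding from the covariate, whereas the Study 2 estimates come from randomized assignment, which is the reason for following the learning step with an experimental test.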