A general theory of regression adjustment for covariate-adaptive randomization: OLS, Lasso, and beyond
We consider the problem of estimating and inferring treatment effects in randomized experiments. In practice, stratified randomization, or more generally, covariate-adaptive randomization, is routinely used in the design stage to balance the treatment allocations with respect to a few variables that are most relevant to the outcomes. Then, regression is performed in the analysis stage to adjust the remaining imbalances to yield more efficient treatment effect estimators. Building upon and unifying the recent results obtained for ordinary least squares adjusted estimators under covariate-adaptive randomization, this paper presents a general theory of regression adjustment that allows for arbitrary model misspecification and the presence of a large number of baseline covariates. We exemplify the theory on two Lasso-adjusted treatment effect estimators, both of which are optimal in their respective classes. In addition, nonparametric consistent variance estimators are proposed to facilitate valid inferences, which work irrespective of the specific randomization methods used. The robustness and improved efficiency of the proposed estimators are demonstrated through a simulation study and a clinical trial example. This study sheds light on improving treatment effect estimation efficiency by implementing machine learning methods in covariate-adaptive randomized experiments.
READ FULL TEXT