Towards Optimal Variance Reduction in Online Controlled Experiments
We study optimal variance reduction solutions for online controlled experiments by applying flexible machine learning tools to incorporate covariates that are independent of the treatment but predictive of the outcomes. Employing cross-fitting, we propose variance reduction procedures for both count metrics and ratio metrics in online experiments, under which inference on the estimands is valid under mild convergence conditions. We also establish the asymptotic optimality of all these procedures under a consistency condition on the machine learning estimators. As a complement to the proposed nonlinear optimal procedure, we derive a linear adjustment method for ratio metrics as a special case that is computationally efficient and can flexibly incorporate any pre-treatment covariates. Comprehensive simulation studies are performed and practical suggestions are given. When tested on real online experiment data from LinkedIn, the proposed optimal procedure for ratio metrics reduces variance by up to 80% compared to the standard difference-in-means estimator, and by up to a further 30% compared to the CUPED approach, by going beyond linearity and incorporating a large number of extra covariates.
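To make the ideas in the abstract concrete, the following is a minimal sketch (not the paper's actual procedure) of covariate-based variance reduction for a count metric on simulated data: a difference-in-means estimator, a CUPED-style linear adjustment, and a cross-fitted adjustment in which a model fitted on one fold is used to adjust outcomes on the other. All variable names and the data-generating process are illustrative assumptions; a simple least-squares fit stands in for a flexible machine learning model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)                  # pre-treatment covariate, independent of treatment
t = rng.integers(0, 2, size=n)          # random treatment assignment
# outcome with true treatment effect 0.5; x is highly predictive of y
y = 1.0 + 0.5 * t + 2.0 * x + rng.normal(size=n)

# 1) Standard difference-in-means estimator
dim = y[t == 1].mean() - y[t == 0].mean()

# 2) CUPED-style linear adjustment: subtract theta * (x - mean(x)) from y,
#    with theta chosen to minimize the variance of the adjusted outcome
theta = np.cov(y, x)[0, 1] / np.var(x)
y_lin = y - theta * (x - x.mean())
cuped = y_lin[t == 1].mean() - y_lin[t == 0].mean()

# 3) Cross-fitted adjustment: split units into two folds; fit a predictor of y
#    from x on one fold and adjust outcomes on the held-out fold, so the
#    adjustment is independent of the outcomes it is applied to
folds = rng.permutation(n) % 2
y_cf = np.empty(n)
for k in (0, 1):
    train, held = folds != k, folds == k
    coef = np.polyfit(x[train], y[train], deg=1)   # stand-in for an ML model
    y_cf[held] = y[held] - np.polyval(coef, x[held]) + y[train].mean()
dim_cf = y_cf[t == 1].mean() - y_cf[t == 0].mean()

print(dim, cuped, dim_cf)
```

Because treatment is randomized, all three estimators target the same treatment effect; the adjusted versions simply have smaller sampling variance when the covariate is predictive of the outcome.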