CONQ: CONtinuous Quantile Treatment Effects for Large-Scale Online Controlled Experiments
In many industry settings, online controlled experimentation (A/B test) has been broadly adopted as the gold standard to measure product or feature impacts. Most research has primarily focused on user engagement type metrics, specifically measuring treatment effects at mean (average treatment effects, ATE), and only a few have been focusing on performance metrics (e.g. latency), where treatment effects are measured at quantiles. Measuring quantile treatment effects (QTE) is challenging due to the myriad difficulties such as dependency introduced by clustered samples, scalability issues, density bandwidth choices, etc. In addition, previous literature has mainly focused on QTE at some pre-defined locations, such as P50 or P90, which doesn't always convey the full picture. In this paper, we propose a novel scalable non-parametric solution, which can provide a continuous range of QTE with point-wise confidence intervals while circumventing the density estimation altogether. Numerical results show high consistency with traditional methods utilizing asymptotic normality. An end-to-end pipeline has been implemented at Snap Inc., providing daily insights on key performance metrics at a distributional level.
READ FULL TEXT