The Benefits of Probability-Proportional-to-Size Sampling in Cluster-Randomized Experiments
In a cluster-randomized experiment, treatment is assigned to clusters of individual units of interest–households, classrooms, villages, etc.–instead of the units themselves. The number of clusters sampled and the number of units sampled within each cluster is typically restricted by a budget constraint. Previous analysis of cluster randomized experiments under the Neyman-Rubin potential outcomes model of response have assumed a simple random sample of clusters. Estimators of the population average treatment effect (PATE) under this assumption are often either biased or not invariant to location shifts of potential outcomes. We demonstrate that, by sampling clusters with probability proportional to the number of units within a cluster, the Horvitz-Thompson estimator (HT) is invariant to location shifts and unbiasedly estimates PATE. We derive standard errors of HT and discuss how to estimate these standard errors. We also show that results hold for stratified random samples when samples are drawn proportionally to cluster size within each stratum. We demonstrate the efficacy of this sampling scheme using a simulation based on data from an experiment measuring the efficacy of the National Solidarity Programme in Afghanistan.
READ FULL TEXT