Scalable Uncertainty Quantification via GenerativeBootstrap Sampler
It has been believed that the virtue of using statistical procedures is on uncertainty quantification in statistical decisions, and the bootstrap method has been commonly used for this purpose. However, nowadays as the size of data massively increases and statistical models become more complicated, the implementation of bootstrapping turns out to be practically challenging due to its repetitive nature in computation. To overcome this issue, we propose a novel computational procedure called Generative Bootstrap Sampler (GBS), which constructs a generator function of bootstrap evaluations, and this function transforms the weights on the observed data points to the bootstrap distribution. The GBS is implemented by one single optimization, without repeatedly evaluating the optimizer of bootstrapped loss function as in standard bootstrapping procedures. As a result, the GBS is capable of reducing computational time of bootstrapping by hundreds of folds when the data size is massive. We show that the bootstrapped distribution evaluated by the GBS is asymptotically equivalent to the conventional counterpart and empirically they are indistinguishable. We examine the proposed idea to bootstrap various models such as linear regression, logistic regression, Cox proportional hazard model, and Gaussian process regression model, quantile regression, etc. The results show that the GBS procedure is not only accelerating the computational speed, but it also attains a high level of accuracy to the target bootstrap distribution. Additionally, we apply this idea to accelerate the computation of other repetitive procedures such as bootstrapped cross-validation, tuning parameter selection, and permutation test.
READ FULL TEXT