Is Monte Carlo a bad sampling strategy for learning smooth functions in high dimensions?
This paper concerns the approximation of smooth, high-dimensional functions from limited samples using polynomials. This task lies at the heart of many applications in computational science and engineering, notably those arising from parametric modelling and uncertainty quantification. It is common to use Monte Carlo (MC) sampling in such applications, so as not to succumb to the curse of dimensionality. However, it is well known that this strategy is theoretically suboptimal. There are many polynomial spaces of dimension n for which the sample complexity scales log-quadratically in n. This well-documented phenomenon has led to a concerted effort to design improved, and in some cases near-optimal, sampling strategies whose sample complexities scale log-linearly, or even linearly, in n. Paradoxically, in this work we show that MC is actually a perfectly good strategy in high dimensions. We first document this phenomenon via several numerical examples. Next, we present a theoretical analysis that resolves this paradox for holomorphic functions of infinitely many variables. We show that there is a least-squares scheme based on m MC samples whose error decays algebraically fast in m/log(m), with a rate that is the same as that of the best n-term polynomial approximation. This result is non-constructive, since it assumes knowledge of a suitable polynomial space in which to perform the approximation. We next present a compressed sensing-based scheme that achieves the same rate, up to a larger polylogarithmic factor. This scheme is practical, and numerically it performs as well as or better than well-known adaptive least-squares schemes. Overall, our findings demonstrate that MC sampling is eminently suitable for smooth function approximation when the dimension is sufficiently high. Hence the benefits of improved sampling strategies are generically limited to lower-dimensional settings.
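To make the setup concrete, the following is a minimal sketch (not the paper's algorithm) of the kind of scheme the abstract describes: draw m Monte Carlo samples from the uniform measure on [-1,1]^d and solve a least-squares problem in an n-dimensional space of tensor-product Legendre polynomials. The test function f, the total-degree index set, and all parameter values below are illustrative assumptions; the paper's analysis concerns a suitable, generally anisotropic, polynomial space for holomorphic functions.

```python
import itertools
import numpy as np

def legendre_1d(y, k):
    """Degree-k Legendre polynomial, orthonormal w.r.t. the uniform
    probability measure on [-1, 1]."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return np.sqrt(2 * k + 1) * np.polynomial.legendre.legval(y, c)

def design_matrix(Y, indices):
    """m x n matrix of tensor-product Legendre polynomials at samples Y."""
    A = np.ones((Y.shape[0], len(indices)))
    for j, nu in enumerate(indices):
        for i, k in enumerate(nu):
            if k > 0:
                A[:, j] *= legendre_1d(Y[:, i], k)
    return A

d, degree, m = 10, 3, 2000
rng = np.random.default_rng(0)

# Multi-index set of total degree <= `degree` (a hypothetical choice,
# standing in for the "suitable polynomial space" in the abstract).
indices = [nu for nu in itertools.product(range(degree + 1), repeat=d)
           if sum(nu) <= degree]

# A smooth (holomorphic) test function with decaying coordinate importance.
c = 1.0 / (2.0 ** np.arange(1, d + 1))
f = lambda Y: 1.0 / (2.0 + Y @ c)

# Monte Carlo samples, least-squares fit, and a held-out error estimate.
Y_train = rng.uniform(-1.0, 1.0, size=(m, d))
coef, *_ = np.linalg.lstsq(design_matrix(Y_train, indices), f(Y_train),
                           rcond=None)

Y_test = rng.uniform(-1.0, 1.0, size=(1000, d))
err = np.linalg.norm(design_matrix(Y_test, indices) @ coef - f(Y_test))
err /= np.linalg.norm(f(Y_test))
print(f"n = {len(indices)} basis functions, relative L2 error ~ {err:.2e}")
```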
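The compressed sensing-based variant avoids fixing the polynomial space in advance: it works with a much larger candidate index set and recovers a sparse coefficient vector from m << n samples. The sketch below uses plain (unweighted) l1 minimization solved by iterative soft thresholding (ISTA) as a stand-in; the paper's actual scheme may differ (for instance, weighted l1 minimization is common in this literature), and the function, index set, and parameters are again illustrative. The helper functions match those in the previous sketch.

```python
import itertools
import numpy as np

def legendre_1d(y, k):
    c = np.zeros(k + 1)
    c[k] = 1.0
    return np.sqrt(2 * k + 1) * np.polynomial.legendre.legval(y, c)

def design_matrix(Y, indices):
    A = np.ones((Y.shape[0], len(indices)))
    for j, nu in enumerate(indices):
        for i, k in enumerate(nu):
            if k > 0:
                A[:, j] *= legendre_1d(Y[:, i], k)
    return A

d, degree = 8, 4
rng = np.random.default_rng(1)
indices = [nu for nu in itertools.product(range(degree + 1), repeat=d)
           if sum(nu) <= degree]
n = len(indices)   # 495 candidate basis functions
m = 200            # deliberately m << n

c = 1.0 / (2.0 ** np.arange(1, d + 1))
f = lambda Y: np.exp(-Y @ c)

Y = rng.uniform(-1.0, 1.0, size=(m, d))
A = design_matrix(Y, indices) / np.sqrt(m)
b = f(Y) / np.sqrt(m)

# ISTA (iterative soft thresholding) for the LASSO problem
#   min_x 0.5 * ||A x - b||_2^2 + lam * ||x||_1.
lam = 1e-4
t = 1.0 / np.linalg.norm(A, 2) ** 2   # step size 1 / ||A||_2^2
x = np.zeros(n)
for _ in range(5000):
    z = x - t * A.T @ (A @ x - b)
    x = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)

Y_test = rng.uniform(-1.0, 1.0, size=(1000, d))
err = np.linalg.norm(design_matrix(Y_test, indices) @ x - f(Y_test))
err /= np.linalg.norm(f(Y_test))
print(f"m = {m} samples, n = {n} candidates, "
      f"{np.count_nonzero(x)} nonzero coefficients, rel. error ~ {err:.2e}")
```

The design choice to note here is the regime m << n: unlike the least-squares sketch, recovery relies on the coefficient vector being (approximately) sparse, which is exactly what best n-term approximation rates for holomorphic functions provide.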