A super-polynomial lower bound for learning nonparametric mixtures

03/28/2022
by   Bryon Aragam, et al.

We study the problem of learning nonparametric distributions in a finite mixture, and establish a super-polynomial lower bound on the sample complexity of learning the component distributions in such models. Namely, we are given i.i.d. samples from f, where f = ∑_{i=1}^k w_i f_i with ∑_{i=1}^k w_i = 1 and w_i > 0, and we are interested in learning each component f_i. Without any assumptions on the f_i, this problem is ill-posed. In order to identify the components f_i, we assume that each f_i can be written as a convolution of a Gaussian and a compactly supported density ν_i with supp(ν_i) ∩ supp(ν_j) = ∅ for i ≠ j. Our main result shows that Ω((1/ε)^{C log log(1/ε)}) samples are required for estimating each f_i. The proof relies on a fast rate for approximation with Gaussians, which may be of independent interest. This result has important implications for the hardness of learning more general nonparametric latent variable models that arise in machine learning applications.
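To make the data-generating model concrete, the sketch below draws i.i.d. samples from a mixture of the form described above, with each component equal to a compactly supported density convolved with a Gaussian. This is not code from the paper: the choice of uniform densities on disjoint intervals for the ν_i, and all weights, intervals, and the noise level σ, are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model parameters (assumptions, not from the paper):
w = np.array([0.3, 0.5, 0.2])                        # mixture weights, summing to 1
intervals = [(-3.0, -2.0), (0.0, 1.0), (4.0, 5.0)]   # disjoint supports of the nu_i
sigma = 0.5                                          # std. dev. of the Gaussian factor

def sample(n):
    """Draw n i.i.d. samples from f = sum_i w_i (nu_i * N(0, sigma^2))."""
    # Pick a component index i with probability w_i.
    comps = rng.choice(len(w), size=n, p=w)
    # Draw from nu_i (here uniform on its interval), then add independent
    # Gaussian noise; the sum has density nu_i convolved with N(0, sigma^2).
    lo = np.array([intervals[c][0] for c in comps])
    hi = np.array([intervals[c][1] for c in comps])
    latent = rng.uniform(lo, hi)
    return latent + rng.normal(0.0, sigma, size=n)

x = sample(10_000)   # the learner sees only x, not the component labels

The disjoint supports of the ν_i are what make the components f_i identifiable; the lower bound concerns how many such samples are needed to estimate each f_i to accuracy ε.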
