Sub-Gaussian Mean Estimation in Polynomial Time
We study polynomial time algorithms for estimating the mean of a random vector X in R^d from n independent samples X_1,...,X_n when X may be heavy-tailed. We assume only that X has finite mean μ and covariance Σ. In this setting, the radius of confidence intervals achieved by the empirical mean are large compared to the case that X is Gaussian or sub-Gaussian. In particular, for confidence δ > 0, the empirical mean has confidence intervals with radius of order √(TrΣ / δ n) rather than √(TrΣ /n ) + √(λ_(Σ) (1/δ) / n) from the Gaussian case. We offer the first polynomial time algorithm to estimate the mean with sub-Gaussian confidence intervals under such mild assumptions. Our algorithm is based on a new semidefinite programming relaxation of a high-dimensional median. Previous estimators which assumed only existence of O(1) moments of X either sacrifice sub-Gaussian performance or are only known to be computable via brute-force search procedures requiring (d) time.
READ FULL TEXT