Limits on Gradient Compression for Stochastic Optimization

01/24/2020
by Prathamesh Mayekar, et al.

We consider stochastic optimization over ℓ_p spaces using access to a first-order oracle. We ask: What is the minimum precision required for oracle outputs to retain the unrestricted convergence rates? We characterize this precision for every p ≥ 1 by deriving information-theoretic lower bounds and by providing quantizers that (almost) achieve these lower bounds. Our quantizers are new and easy to implement. In particular, our results are exact for p=2 and p=∞, showing that the minimum precision needed in these settings is Θ(d) and Θ(log d), respectively. The latter result is surprising since merely recovering the gradient vector requires Ω(d) bits.
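
To make the precision question concrete, here is a minimal sketch of a generic b-bit unbiased stochastic gradient quantizer. This is not the paper's quantizer, only a standard illustration of how bits per coordinate trade off against quantization error; all names and parameters below are illustrative.

```python
import numpy as np

def stochastic_quantize(g, bits=4, rng=None):
    """Quantize gradient g to `bits` bits per coordinate, unbiasedly.

    Each coordinate's magnitude is scaled by the ell_inf norm and rounded
    up or down at random so that the decoded vector satisfies E[q] = g.
    (Illustrative only; not the quantizer proposed in the paper.)
    """
    rng = np.random.default_rng() if rng is None else rng
    levels = 2 ** bits - 1               # number of quantization intervals
    scale = np.max(np.abs(g))            # ell_inf normalization
    if scale == 0:
        return np.zeros_like(g)
    u = np.abs(g) / scale * levels       # map magnitudes to [0, levels]
    lower = np.floor(u)
    # Round up with probability equal to the fractional part (unbiased).
    q = lower + (rng.random(g.shape) < (u - lower))
    return np.sign(g) * q / levels * scale

# Usage: the decoded vector is an unbiased estimate of the true gradient,
# communicated with roughly d * (bits + 1) bits plus the cost of the norm.
g = np.random.randn(1000)
g_hat = stochastic_quantize(g, bits=4)
print(np.linalg.norm(g - g_hat) / np.linalg.norm(g))
```

A scheme like this spends Ω(d) bits per gradient; the paper's result for p=∞ shows that far fewer bits, Θ(log d), suffice to preserve the unrestricted convergence rate, even though the gradient itself cannot be recovered from so few bits.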
