Pointwise Bounds for Distribution Estimation under Communication Constraints
We consider the problem of estimating a d-dimensional discrete distribution from its samples observed under a b-bit communication constraint. In contrast to most previous results, which largely focus on the global minimax error, we study the local behavior of the estimation error and provide pointwise bounds that depend on the target distribution p. In particular, we show that the ℓ_2 error decays as O(‖p‖_{1/2}/(n 2^b) ∨ 1/n) when n is sufficiently large (throughout, a ∨ b and a ∧ b denote max(a, b) and min(a, b), respectively); hence it is governed by the half-norm of p rather than the ambient dimension d. For the achievability result, we propose a two-round sequentially interactive estimation scheme that achieves this error rate uniformly over all p. Our scheme is based on a novel local refinement idea: we first use a standard global minimax scheme to localize p, and then use the remaining samples to locally refine our estimate. We also develop a new local minimax lower bound with (almost) matching ℓ_2 error, showing that any interactive scheme must incur an Ω(‖p‖_{(1+δ)/2}/(n 2^b)) ℓ_2 error for any δ > 0. The lower bound is derived by first finding the best parametric sub-model containing p, and then upper bounding the quantized Fisher information under this model. Our upper and lower bounds together indicate that ℋ_{1/2}(p) = log ‖p‖_{1/2} bits of communication are both sufficient and necessary to achieve the optimal (centralized) performance, where ℋ_{1/2}(p) is the Rényi entropy of p of order 1/2. Therefore, under the ℓ_2 loss, the correct measure of the local communication complexity at p is its Rényi entropy.
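To make the quantities governing the rate concrete, here is a minimal sketch (not from the paper) of the two scalars the abstract refers to: the half-norm ‖p‖_{1/2} = (Σ_i √p_i)², and the order-1/2 Rényi entropy ℋ_{1/2}(p) = log ‖p‖_{1/2}. For the uniform distribution ‖p‖_{1/2} = d, recovering the global minimax rate, while for skewed distributions it can be far smaller than d.

```python
import math

def half_norm(p):
    """ℓ_{1/2} quasi-norm of a distribution: ||p||_{1/2} = (sum_i sqrt(p_i))^2."""
    return sum(math.sqrt(pi) for pi in p) ** 2

def renyi_entropy_half(p):
    """Rényi entropy of order 1/2 (in nats): H_{1/2}(p) = log ||p||_{1/2}."""
    return math.log(half_norm(p))

# Uniform over d symbols: ||p||_{1/2} = d, so the pointwise rate
# O(||p||_{1/2} / (n 2^b)) matches the global minimax rate O(d / (n 2^b)).
d = 8
uniform = [1.0 / d] * d
print(half_norm(uniform))  # 8.0 (up to floating point)

# A highly skewed distribution on the same alphabet has a much
# smaller half-norm, hence a faster local rate at this p.
skewed = [0.93] + [0.01] * 7
print(half_norm(skewed))
```

The example only illustrates the norms in the stated bounds; the two-round localize-then-refine scheme itself is described in the full paper.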