Inference under Information Constraints I: Lower Bounds from Chi-Square Contraction
We consider a distributed inference problem in which only limited information about each sample may be communicated. We present a general framework where multiple players each receive an independent sample and can provide only limited information about it to a central referee. Motivated by important instances of communication and privacy constraints, our abstraction allows each player to describe its observed sample to the referee using a channel from a prespecified family of channels W. This family W can be instantiated to capture the communication- and privacy-constrained settings, among others. The central referee uses the messages from the players to solve an inference problem for the unknown underlying distribution that generated the players' samples. We derive lower bounds on the sample complexity of learning and testing discrete distributions in this information-constrained setting. Underlying our lower bounds is a quantitative characterization of the contraction in chi-square distance between the observed distributions of the samples when an information constraint is imposed. This contraction is captured, in a local neighborhood of the underlying distribution, by the chi-square fluctuation and the decoupled chi-square fluctuation of a given channel, two quantities we introduce in this work. The former captures the average distance between two product distributions, and the latter the distance from a product distribution to a mixture of product distributions. These quantities may be of independent interest. As a corollary, we quantify the sample complexity blow-up in the learning and testing settings when the corresponding local information constraints are enforced. In addition, we systematically study the role of randomness and consider both private- and public-coin protocols.
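As a brief illustration of the contraction phenomenon described above (the notation here is ours and serves only as a sketch, not the paper's formal definitions): a channel W maps an input distribution p on a domain X to the output distribution

    p^W(y) = \sum_{x \in X} W(y | x) p(x).

By the data-processing inequality for f-divergences, the chi-square distance between output distributions can only shrink:

    \chi^2(p^W, q^W) \le \chi^2(p, q),    where    \chi^2(p, q) = \sum_x (p(x) - q(x))^2 / q(x).

The lower bounds rest on quantifying how much this distance shrinks over the allowed family W, in a local neighborhood of a fixed distribution; the chi-square and decoupled chi-square fluctuations are the quantities introduced for exactly this purpose.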