Privately Answering Classification Queries in the Agnostic PAC Model
We revisit the problem of differentially private release of classification queries. In this problem, the goal is to design an algorithm that can accurately answer a sequence of classification queries based on a private training set while ensuring differential privacy. We formally study this problem in the agnostic PAC model and derive a new upper bound on the private sample complexity. Our results improve over those obtained in a recent work [BTT18] for the agnostic PAC setting. In particular, we give an improved construction that yields a tighter upper bound on the sample complexity. Moreover, unlike [BTT18], our accuracy guarantee does not involve any blow-up in the approximation error associated with the given hypothesis class. Given any hypothesis class with VC-dimension d, we show that our construction can privately answer up to m classification queries with average excess error α using a private sample of size ≈ (d/α²)·max(1, √m·α^{3/2}). Using recent results on private learning with auxiliary public data, we extend our construction to show that one can privately answer any number of classification queries with average excess error α using a private sample of size ≈ (d/α²)·max(1, √d·α). Our results imply that when α is sufficiently small (the high-accuracy regime), the private sample size is essentially the same as the non-private sample complexity of agnostic PAC learning.
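As a quick sanity check of the stated bounds, the following sketch evaluates the two sample-size expressions from the abstract (with the constants suppressed by the ≈ notation omitted) and illustrates the high-accuracy regime in which the max(·) factor becomes 1 and the bound reduces to the non-private agnostic rate d/α². The function names and the specific parameter values are illustrative choices, not part of the paper.

```python
import math

def sample_size_m_queries(d, alpha, m):
    """Upper bound from the abstract for answering up to m queries:
    n ~ (d / alpha^2) * max(1, sqrt(m) * alpha^(3/2)).
    Constants hidden by the ~ notation are omitted."""
    return (d / alpha**2) * max(1.0, math.sqrt(m) * alpha**1.5)

def sample_size_unbounded(d, alpha):
    """Bound with auxiliary public data, for any number of queries:
    n ~ (d / alpha^2) * max(1, sqrt(d) * alpha)."""
    return (d / alpha**2) * max(1.0, math.sqrt(d) * alpha)

# Illustrative parameters (not from the paper): VC-dimension d = 50,
# m = 1000 queries. Once sqrt(m) * alpha^(3/2) <= 1 (i.e. alpha is small),
# the first bound equals d / alpha^2, matching the non-private rate.
d, m = 50, 1000
for alpha in (0.2, 0.05, 0.01):
    print(alpha, sample_size_m_queries(d, alpha, m), sample_size_unbounded(d, alpha))
```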