The Price of Differential Privacy For Online Learning

01/27/2017
by Naman Agarwal, et al.

We design differentially private algorithms for the problem of online linear optimization in the full-information and bandit settings with optimal Õ(√T) regret bounds. In the full-information setting, our results demonstrate that ϵ-differential privacy may be ensured for free: in particular, the regret bounds scale as O(√T) + Õ(1/ϵ). For bandit linear optimization, and as a special case for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of Õ((1/ϵ)√T), while the previously known best regret bound was Õ((1/ϵ)T^{2/3}).
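For intuition about how privacy can come "for free" in the full-information setting, the sketch below shows one standard recipe such results are often built on, not the authors' specific construction: exponential weights driven by cumulative losses released through the binary-tree (counter) mechanism. The function names, the l1_sensitivity parameter, and the noise-calibration constants are illustrative assumptions; the actual algorithms and analysis are in the paper.

```python
import numpy as np


def dyadic_intervals(t):
    """Decompose [0, t) into O(log t) dyadic intervals (lo, span)."""
    out, lo = [], 0
    while lo < t:
        span = 1
        while lo % (2 * span) == 0 and lo + 2 * span <= t:
            span *= 2
        out.append((lo, span))
        lo += span
    return out


def private_exponential_weights(losses, eps, eta, l1_sensitivity, seed=0):
    """Illustrative eps-differentially-private exponential weights (sketch).

    losses         : (T, d) array of per-round loss vectors.
    l1_sensitivity : bound on the L1 norm of one round's loss vector
                     (e.g. d for losses in [0, 1]^d); this caps how much a
                     single round can change any tree node's partial sum.
    Cumulative losses are released via the binary-tree mechanism: each round
    touches O(log T) tree nodes, so per-node Laplace noise of scale
    l1_sensitivity * ceil(log2 T) / eps gives eps-DP for the whole stream
    (assumption: standard composition argument).
    """
    T, d = losses.shape
    scale = l1_sensitivity * max(1.0, np.ceil(np.log2(T))) / eps
    rng = np.random.default_rng(seed)
    node_noise = {}  # one Laplace draw per dyadic node, reused across rounds

    def noisy_prefix(exact_prefix, t):
        # Noisy prefix = exact prefix + sum of node noises on its dyadic
        # cover; distributionally the same as summing noisy node sums.
        noise = np.zeros(d)
        for key in dyadic_intervals(t):
            if key not in node_noise:
                node_noise[key] = rng.laplace(scale=scale, size=d)
            noise += node_noise[key]
        return exact_prefix + noise

    plays, cum = [], np.zeros(d)
    for t in range(T):
        L = noisy_prefix(cum, t) if t > 0 else np.zeros(d)
        w = np.exp(-eta * (L - L.min()))  # multiplicative-weights update
        plays.append(int(rng.choice(d, p=w / w.sum())))
        cum += losses[t]  # true losses accumulate; only noisy sums are used
    return plays


if __name__ == "__main__":
    T, d = 1000, 5
    losses = np.random.default_rng(1).random((T, d))
    plays = private_exponential_weights(
        losses, eps=1.0, eta=np.sqrt(np.log(d) / T), l1_sensitivity=float(d)
    )
    print("rounds played:", len(plays))
```

Roughly speaking, the per-node noise scale grows only logarithmically in T, which is why in analyses of this kind the privacy cost can end up as a lower-order additive term rather than multiplying the √T regret.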
