Differentially Private Multi-Armed Bandits in the Shuffle Model

06/05/2021
by Jay Tenenbaum, et al.

We give an (ε,δ)-differentially private algorithm for the multi-armed bandit (MAB) problem in the shuffle model with a distribution-dependent regret of O((∑_{a∈[k]: Δ_a>0} log T/Δ_a) + k√(log(1/δ)) log T/ε), and a distribution-independent regret of O(√(kT log T) + k√(log(1/δ)) log T/ε), where T is the number of rounds, Δ_a is the suboptimality gap of arm a, and k is the total number of arms. Our upper bound almost matches the regret of the best known algorithms for the centralized model, and significantly outperforms the best known algorithm in the local model.
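To see how the distribution-independent bound scales, the sketch below evaluates the two terms of O(√(kT log T) + k√(log(1/δ)) log T/ε) numerically, ignoring hidden constants. This is purely illustrative of the stated bound, not the authors' algorithm; the function name and parameter values are assumptions.

```python
import math

def shuffle_regret_bound(T, k, eps, delta):
    """Evaluate the distribution-independent regret bound from the
    abstract, up to constant factors:
        sqrt(k * T * log T) + k * sqrt(log(1/delta)) * log T / eps
    The first term is the usual non-private bandit regret; the second
    is the additive cost of (eps, delta)-privacy in the shuffle model,
    which grows only polylogarithmically in T."""
    exploration = math.sqrt(k * T * math.log(T))
    privacy_cost = k * math.sqrt(math.log(1.0 / delta)) * math.log(T) / eps
    return exploration + privacy_cost

# Illustrative parameter choices (assumed, not from the paper):
# as T grows, the per-round regret bound(T)/T shrinks, i.e. the
# bound is sublinear, and the privacy term stays a lower-order cost.
for T in (10**4, 10**6):
    print(T, shuffle_regret_bound(T, k=10, eps=1.0, delta=1e-6))
```

Note that the privacy term has no √T dependence, which is the key advantage over the local model, where the privacy cost typically scales polynomially in T.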
