Diversity-Preserving K-Armed Bandits, Revisited

10/05/2020
by Hédi Hadiji et al.

We consider the bandit-based framework for diversity-preserving recommendations introduced by Celis et al. (2019), who approached it mainly by a reduction to the setting of linear bandits. We design a UCB algorithm using the specific structure of the setting and show that it enjoys a bounded distribution-dependent regret in the natural cases when the optimal mixed actions put some probability mass on all actions (i.e., when diversity is desirable). Simulations illustrate this fact. We also provide regret lower bounds and briefly discuss distribution-free regret bounds.
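To make the setting concrete, below is a minimal sketch of an optimistic (UCB-style) strategy for diversity-preserving bandits, assuming the common formulation in which the learner plays mixed actions, i.e., distributions p over the K arms subject to per-arm lower bounds p[a] >= l[a]. The function name, the confidence width, and the Bernoulli reward model are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np


def diversity_preserving_ucb(means, lower_bounds, horizon, rng=None):
    """Sketch of an optimistic strategy over constrained mixed actions.

    Illustrative only: the confidence width and reward model are assumptions,
    not the algorithm from Hadiji et al. (2020).
    """
    rng = np.random.default_rng() if rng is None else rng
    means = np.asarray(means, dtype=float)
    lower_bounds = np.asarray(lower_bounds, dtype=float)
    K = len(means)
    free_mass = 1.0 - lower_bounds.sum()  # mass left once the constraints are met
    counts = np.zeros(K)  # pulls per arm
    sums = np.zeros(K)    # summed rewards per arm
    total = 0.0
    for t in range(1, horizon + 1):
        # UCB indices; arms never pulled get an infinite index.
        ucb = np.full(K, np.inf)
        seen = counts > 0
        ucb[seen] = sums[seen] / counts[seen] + np.sqrt(2.0 * np.log(t) / counts[seen])
        # Optimistic mixed action: a linear objective over the polytope
        # {p : p >= lower_bounds, sum(p) = 1} is maximized by giving each
        # arm its minimum mass and the leftover mass to the best-index arm.
        p = lower_bounds.copy()
        p[int(np.argmax(ucb))] += free_mass
        arm = rng.choice(K, p=p)                   # arm drawn from the mixed action
        reward = float(rng.random() < means[arm])  # Bernoulli(means[arm]) reward
        counts[arm] += 1.0
        sums[arm] += reward
        total += reward
    return total


# Example: three Bernoulli arms, each guaranteed at least 10% of the play.
print(diversity_preserving_ucb([0.2, 0.5, 0.8], [0.1, 0.1, 0.1], horizon=10_000))
```

Informally, this sketch illustrates the mechanism the abstract points to: when every lower bound is strictly positive, all arms keep being sampled at a linear rate for free, so all confidence intervals shrink and the optimistic mixed action can stabilize, which is compatible with the bounded distribution-dependent regret claimed for the natural cases.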
