This paper studies adversarial graphical contextual bandits, a varia...
We consider reinforcement learning (RL) in episodic Markov decision proc...
The cascading bandit (CB) is a variant of both the multi-armed bandit (MAB) ...
We investigate the piecewise-stationary combinatorial semi-bandit proble...
We propose a hypergraph-based active learning scheme which we term HS^2,...