We propose EB-TCε, a novel sampling rule for ε-best arm identification i...
In fixed budget bandit identification, an algorithm sequentially observe...
We present the formalization of Doob's martingale convergence theorems i...
A Top Two sampling rule for bandit identification is a method which sele...
The problem of identifying the best arm among a collection of items havi...
Top Two algorithms arose as an adaptation of Thompson sampling to best a...
In pure-exploration problems, information is gathered sequentially to an...
Elimination algorithms for bandit identification, which prune the plausi...
We study the problem of the identification of m arms with largest means ...
In the fixed budget thresholding bandit problem, an algorithm sequential...
We study reward maximisation in a wide class of structured stochastic mu...
We investigate an active pure-exploration setting, that includes best-ar...
Pure exploration (aka active testing) is the fundamental task of sequent...
We determine the sample complexity of pure exploration bandit problems w...
State of the art online learning procedures focus either on selecting th...
We consider the classical stochastic multi-armed bandit but where, from ...