We propose a new best-of-both-worlds algorithm for bandits with variably...
We study a K-armed bandit with delayed feedback and intermediate observa...
We present a modified tuning of the algorithm of Zimmert and Seldin [202...
We present a new concentration of measure inequality for sums of indepen...
We consider online learning with feedback graphs, a sequential decision-...
We present a new second-order oracle bound for the expected risk of a we...
We derive improved regret bounds for the Tsallis-INF algorithm of Zimmer...
We propose an algorithm for stochastic and adversarial multiarmed bandit...
We propose a new algorithm for adversarial multi-armed bandits with unre...
We investigate multiarmed bandits with delayed feedback, where the delay...
Existing guarantees in terms of rigorous upper bounds on the generalizat...
We provide an algorithm that achieves the optimal (up to constants) fini...
We introduce the factored bandits model, which is a framework for learni...
We derive an online learning algorithm with improved regret guarantees f...
We present a new strategy for gap estimation in randomized algorithms fo...
New ranking algorithms are continually being developed and refined, nece...
We propose a new PAC-Bayesian bound and a way of constructing a hypothes...
Advice-efficient prediction with expert advice (in analogy to label-effi...
We present a set of high-probability inequalities that control the conce...
We develop a coherent framework for integrative simultaneous analysis of...
We present two alternative ways to apply PAC-Bayesian analysis to sequen...
We formulate weighted graph clustering as a prediction problem: given a ...