We study the problem of planning under model uncertainty in an online
me...
We study a sequential decision problem where the learner faces a sequenc...
We build on the recently proposed EigenGame that views eigendecompositio...
We introduce a computationally efficient algorithm for finite stochastic...
This note proposes a new proof and new perspectives on the so-called
Ell...
We present a novel view on principal component analysis (PCA) as a
compe...
We consider off-policy evaluation in the contextual bandit setting for t...
Significant work has recently been dedicated to the stochastic delayed b...
Online recommender systems often face long delays in receiving feedback,...
Stochastic Rank-One Bandits (Katariya et al., 2017a,b) are a simple fram...
We consider a stochastic linear bandit model in which the available acti...
Delayed feedback is a ubiquitous problem in many industrial systems
emp...
This paper is devoted to the study of the max K-armed bandit problem, wh...
The probability that a user will click a search result depends both on i...
We propose stochastic rank-1 bandits, a class of online learning problem...
Recommending items to users is a challenging task due to the large amoun...