The Learning With Errors (LWE) problem is one of the major hard problems...
We consider a multi-armed bandit setting where, at the beginning of each...
Contextual bandit algorithms are widely used in domains where it is desi...
This paper studies privacy-preserving exploration in Markov Decision Pro...
We study bandits and reinforcement learning (RL) subject to a conservati...
Contextual bandit is a general framework for online learning in sequenti...
Reinforcement learning algorithms are widely used in domains where it is...
Contextual bandit algorithms are applied in a wide range of domains, fro...
In many fields such as digital marketing, healthcare, finance, and robot...
While learning in an unknown Markov Decision Process (MDP), an agent sho...
Many popular reinforcement learning problems (e.g., navigation in a maze...
We consider the classical stochastic multi-armed bandit but where, from ...