In a broad class of reinforcement learning applications, stochastic rewa...
Mean-field games have been used as a theoretical tool to obtain an appro...
Natural actor-critic (NAC) and its variants, equipped with the represent...
We consider the reinforcement learning problem for partially observed Ma...
In a wide variety of applications including online advertising, contract...
Natural policy gradient (NPG) methods with function approximation achiev...
We study the dynamics of temporal-difference learning with neural networ...
Time-constrained decision processes have been ubiquitous in many fundame...
The theory of discrete-time online learning has been successfully applie...
We consider a budget-constrained bandit problem where each arm pull incu...
We study the problem of serving randomly arriving and delay-sensitive tr...