In reinforcement learning (RL), state representations are key to dealing...
Multi-step learning applies lookahead over multiple time steps and has p...
We analyse quantile temporal-difference learning (QTD), a distributional...
Reward is the driving force for reinforcement-learning agents. This pape...
Credit assignment in reinforcement learning is the problem of measuring ...
Reinforcement learning is a powerful learning paradigm in which agents c...
We consider the problem of efficient credit assignment in reinforcement ...
The principal contribution of this paper is a conceptual framework for o...
In this work, we consider the problem of autonomously discovering behavi...
A temporally abstract action, or an option, is specified by a policy and...
Many real-world reinforcement learning problems have a hierarchical natu...
In this work, we take a fresh look at some old and new algorithms for of...
We propose and analyze an alternate approach to off-policy multi-step te...
Potential-based reward shaping (PBRS) is an effective and popular techni...
Recent advances in gradient temporal-difference methods allow learning o...