In this paper, we explore an approach to auxiliary task discovery in rei...
A central challenge to applying many off-policy reinforcement learning a...
Many off-policy prediction learning algorithms have been proposed in the...
Off-policy prediction – learning the value function for one policy from ...
Many reinforcement learning algorithms rely on value estimation. However...
Catastrophic forgetting remains a severe hindrance to the broad applicat...
It is still common to use Q-learning and temporal difference (TD) learni...
Reinforcement learning systems require good representations to work well...
Using neural networks in the reinforcement learning (RL) framework has a...
Emphatic Temporal Difference (ETD) learning has recently been proposed a...
This paper investigates the problem of online prediction learning, where...
We apply neural nets with ReLU gates in online reinforcement learning. O...
In this paper we present the first empirical study of the emphatic tempo...