Meta-gradients provide a general approach for optimizing the meta-parame...
Learned models of the environment provide reinforcement learning (RL) ag...
We study reinforcement learning (RL) with no-reward demonstrations, a se...
QMIX is a popular Q-learning algorithm for cooperative MARL in the centr...
Non-stationarity arises in Reinforcement Learning (RL) even in stationar...
In many real-world settings, a team of agents must coordinate its behavi...
Gradient-based methods for optimisation of objectives in stochastic sett...
In complex tasks, such as those with large combinatorial action spaces, ...
To be successful in real-world tasks, Reinforcement Learning (RL) needs ...
In the last few years, deep multi-agent reinforcement learning (RL) has ...
In multi-agent reinforcement learning, centralised policies can only be ...
In many real-world settings, a team of agents must coordinate their beha...
Combining deep model-free reinforcement learning with on-line planning i...
Cooperative multi-agent systems can be naturally used to model many real...
Many real-world problems, such as network packet routing and urban traff...