Reward functions are notoriously difficult to specify, especially for ta...
imitation provides open-source implementations of imitation and reward l...
We attack the state-of-the-art Go-playing AI system, KataGo, by training...
In reinforcement learning, different reward functions can be equivalent ...
Self-play reinforcement learning has achieved state-of-the-art, and ofte...
In many real-world applications, the reward function is too complex to b...
Inverse Reinforcement Learning (IRL) algorithms infer a reward function ...
It's challenging to design reward functions for complex, real-world task...
Language models can learn a range of capabilities from unsupervised trai...
In many real-world tasks, it is not possible to procedurally specify an ...
The objective of many real-world tasks is complex and difficult to proce...
For many tasks, the reward function is too complex to be specified proce...
Deep reinforcement learning (RL) policies are known to be vulnerable to ...
Deep reinforcement learning achieves superhuman performance in a range o...
Reward design, the problem of selecting an appropriate reward function f...
Multi-task Inverse Reinforcement Learning (IRL) is the problem of inferr...