In this work we introduce reinforcement learning techniques for solving...
The aim of Inverse Reinforcement Learning (IRL) is to infer a reward fun...
We provide the first formal definition of reward hacking, a phenomenon w...
It's challenging to design reward functions for complex, real-world task...
In this paper I present an argument and a general schema which can be us...
Overparameterised deep neural networks (DNNs) are highly expressive and ...
Understanding the inductive bias of neural networks is critical to expla...
We analyze the type of learned optimization that occurs when a learned m...