Learning High-level Representations from Demonstrations
Hierarchical learning (HL) is key to solving complex sequential decision problems with long horizons and sparse rewards. It allows learning agents to break-up large problems into smaller, more manageable subtasks. A common approach to HL, is to provide the agent with a number of high-level skills that solve small parts of the overall problem. A major open question, however, is how to identify a suitable set of reusable skills. We propose a principled approach that uses human demonstrations to infer a set of subgoals based on changes in the demonstration dynamics. Using these subgoals, we decompose the learning problem into an abstract high-level representation and a set of low-level subtasks. The abstract description captures the overall problem structure, while subtasks capture desired skills. We demonstrate that we can jointly optimize over both levels of learning. We show that the resulting method significantly outperforms previous baselines on two challenging problems: the Atari 2600 game Montezuma's Revenge, and a simulated robotics problem moving the ant robot through a maze.
READ FULL TEXT