Assessing Evolutionary Terrain Generation Methods for Curriculum Reinforcement Learning

03/29/2022
by David Howard, et al.

Curriculum learning allows complex tasks to be mastered via incremental progression over 'stepping stone' goals towards a final desired behaviour. Typical implementations learn locomotion policies for challenging environments through gradual complexification of a terrain mesh generated by a parameterised noise function. To date, researchers have predominantly generated terrains from a limited range of noise functions, and the effect of the generator on the learning process is underrepresented in the literature. We compare popular noise-based terrain generators to two indirect encodings, CPPN and GAN. To allow direct comparison between direct and indirect representations, we assess the impact of a range of representation-agnostic MAP-Elites feature descriptors that compute metrics directly from the generated terrain meshes. Next, performance and coverage are assessed when training a humanoid robot in a physics simulator using the PPO algorithm. Results describe key differences between the generators that inform their use in curriculum learning, and present a range of useful feature descriptors for uptake by the community.
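To make the idea of a representation-agnostic feature descriptor concrete, the following is a minimal sketch and not the authors' implementation. It assumes a terrain is a square heightmap, uses a sum of random-phase sinusoids as a stand-in for the noise functions mentioned above, and picks two illustrative descriptors (height standard deviation and mean absolute height difference between neighbouring cells). The names `make_terrain`, `descriptors`, and `archive_cell` are hypothetical. Because the descriptors read only the height values, the same functions would apply to a terrain produced by any generator.

```python
import math
import random
from statistics import pstdev


def make_terrain(size, amplitude, seed=0, octaves=3):
    """Heightmap from a sum of random-phase sinusoids.

    Assumption: this is a simple stand-in for a parameterised noise
    function; `amplitude` acts as the curriculum difficulty knob.
    """
    rng = random.Random(seed)
    # One (phase_x, phase_y, frequency) triple per octave.
    waves = [
        (rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi), 2 ** k)
        for k in range(octaves)
    ]
    grid = []
    for y in range(size):
        row = []
        for x in range(size):
            h = 0.0
            for phase_x, phase_y, freq in waves:
                # Higher octaves contribute finer, lower-amplitude detail.
                h += (amplitude / freq) * (
                    math.sin(2 * math.pi * freq * x / size + phase_x)
                    * math.sin(2 * math.pi * freq * y / size + phase_y)
                )
            row.append(h)
        grid.append(row)
    return grid


def descriptors(grid):
    """Two illustrative descriptors computed only from height values."""
    heights = [h for row in grid for h in row]
    roughness = pstdev(heights)
    # Absolute height differences between horizontal and vertical neighbours.
    diffs = [abs(row[i + 1] - row[i]) for row in grid for i in range(len(row) - 1)]
    diffs += [
        abs(grid[j + 1][i] - grid[j][i])
        for j in range(len(grid) - 1)
        for i in range(len(grid[0]))
    ]
    mean_slope = sum(diffs) / len(diffs)
    return roughness, mean_slope


def archive_cell(desc, bounds, bins):
    """Map a descriptor tuple to a MAP-Elites grid cell, clamping to the grid."""
    cell = []
    for value, (lo, hi) in zip(desc, bounds):
        frac = (value - lo) / (hi - lo)
        cell.append(min(bins - 1, max(0, int(frac * bins))))
    return tuple(cell)


if __name__ == "__main__":
    # A toy curriculum: the same terrain shape at increasing amplitude.
    for stage, amplitude in enumerate([0.1, 0.5, 1.0]):
        desc = descriptors(make_terrain(32, amplitude, seed=1))
        print(stage, desc, archive_cell(desc, [(0.0, 1.0), (0.0, 0.5)], bins=10))
```

In this sketch the descriptor bounds and bin count passed to `archive_cell` are arbitrary choices; in practice they would be set from the range of values the generators actually produce.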
