Eurekaverse: Environment Curriculum Generation via Large Language Models

Sep 1, 1010

Will Liang

Sam Wang

Hungju Wang

Osbert Bastani

Dinesh Jayaraman*

Yecheng Jason Ma*

Abstract

Recent work has demonstrated that a promising strategy for teaching robots a wide range of complex skills is to train them on a curriculum of progressively more challenging environments. However, developing an effective curriculum of environment distributions currently requires significant expert effort, which must be repeated for every new domain. Our key insight is that environments are often naturally represented as code. Thus, we probe whether effective environment curriculum design can be achieved and automated via code generation by large language models (LLMs). In this paper, we introduce Eurekaverse, an unsupervised environment design algorithm that uses LLMs to sample progressively more challenging, diverse, and learnable environments for skill training. We validate Eurekaverse’s effectiveness in the domain of quadrupedal parkour learning, in which a quadruped robot must traverse a variety of obstacle courses. The automatic curriculum designed by Eurekaverse enables gradual learning of complex parkour skills in simulation that successfully transfer to the real world, outperforming manual training courses designed by humans.
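To make the "environments as code" idea concrete, here is a minimal toy sketch of a curriculum loop. It is not the Eurekaverse implementation: `propose_environments` is a hypothetical random stand-in for the LLM code-generation step, `success_rate` is a hypothetical stand-in for reinforcement-learning training and evaluation, and the one-dimensional step terrain, the scalar `skill`, and all thresholds are assumptions chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Environment:
    """A terrain represented as code (illustrative assumption)."""
    difficulty: float
    height_at: Callable[[float], float]  # terrain height at position x


def propose_environments(target: float, n: int, rng: random.Random) -> List[Environment]:
    """Hypothetical stand-in for LLM generation: sample n terrains near a target difficulty."""
    envs = []
    for _ in range(n):
        d = max(0.0, target + rng.uniform(-0.1, 0.1))
        # Alternating flat and raised unit-length segments; d is bound per environment.
        envs.append(Environment(d, lambda x, d=d: d * (int(x) % 2)))
    return envs


def success_rate(skill: float, env: Environment) -> float:
    """Hypothetical stand-in for policy evaluation: success drops as the env outpaces the skill."""
    gap = env.difficulty - skill
    return max(0.0, min(1.0, 1.0 - gap / 0.5))


def run_curriculum(rounds: int = 5, n: int = 4, seed: int = 0) -> List[Environment]:
    """Each round: propose slightly harder candidates, keep the hardest learnable one."""
    rng = random.Random(seed)
    skill = 0.2
    chosen: List[Environment] = []
    for _ in range(rounds):
        candidates = propose_environments(skill + 0.1, n, rng)
        learnable = [e for e in candidates if success_rate(skill, e) >= 0.5]
        if learnable:
            env = max(learnable, key=lambda e: e.difficulty)
        else:
            # Nothing is learnable yet, so fall back to the easiest candidate.
            env = min(candidates, key=lambda e: e.difficulty)
        # Toy "training": the skill catches up to the selected environment.
        skill = max(skill, env.difficulty)
        chosen.append(env)
    return chosen


if __name__ == "__main__":
    for i, env in enumerate(run_curriculum()):
        print(f"round {i}: difficulty={env.difficulty:.2f}, step height={env.height_at(1.0):.2f}")
```

The sketch only illustrates the selection pressure named in the abstract: candidates should be harder than what the agent has mastered, yet still learnable. In this toy version the diversity criterion is reduced to random variation among candidates.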

Publication

CoRL (oral)

Last updated on Sep 1, 1010
