Identifying and upsampling important frames from demonstration data can significantly boost imitation learning from histories, and scales easily to complex settings such as autonomous driving from vision.
Oct 8, 8080
We formulate homeostasis as an intrinsic motivation objective and show interesting emergent behavior from minimizing Bayesian surprise with RL across many environments.
Jan 30, 30300
We learn reward functions in unsupervised object keypoint space, to allow us to follow third-person demonstrations with model-based RL.
Oct 15, 15150
Oct 15, 15150
To plan towards long-term goals through visual prediction, we propose a model based on two key ideas: (i) predict in a goal-conditioned way to restrict planning only to useful sequences, and (ii) recursively decompose the goal-conditioned prediction task into an increasingly fine series of subgoals.
Jun 1, 1010
"Causal confusion", where spurious correlates are mistaken to be causes of expert actions, is commonly prevalent in imitation learning, leading to counterintuitive results where additional information can lead to worse task performance. How might one address this?
Dec 12, 12120
In visual prediction tasks, letting your predictive model choose which times to predict does two things: (i) improves prediction quality, and (ii) leads to semantically coherent "bottleneck state" predictions, which are useful for planning.
Jan 1, 1010