We learn reward functions in unsupervised object keypoint space, to allow us to follow third-person demonstrations with model-based RL.
Oct 15, 15150