Pre- and post-contact policy decomposition for non-prehensile manipulation with zero-shot sim-to-real transfer
We present a system for non-prehensile manipulation tasks that require a significant number of contact mode transitions and the use of environmental contacts to move an object to a target location. Our method is based on deep reinforcement learning, which, unlike state-of-the-art planning algorithms, does not require a priori knowledge of the physical parameters of the object or environment, such as friction coefficients or centers of mass. Planning time is reduced to the time of a feed-forward pass through a neural network. We propose a computational structure, action space design, and curriculum learning scheme that facilitate efficient exploration and sim-to-real transfer. In challenging real-world non-prehensile manipulation tasks, we show that our method generalizes across different objects and succeeds even on novel objects not seen during training. Project website: https://sites.google.com/view/nonprenehsile-decomposition