Our proposed framework consists of (1) a higher-level policy for object-centric geometric and kinematic reasoning and (2) a lower-level policy for dynamic contact planning. The higher-level policy, trained using RL, predicts contact intention, which parameterizes the lower-level policy. The lower-level policy seeks to realize the contact intention through local contact-implicit model predictive control (MPC). The output of the lower-level policy is the robot action command.
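The two-level structure described above can be sketched as a nested control loop. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names (`run_hierarchy`, `high_level`, `low_level`, `step`), the scalar toy state, and the fixed number of low-level steps per contact intention are all hypothetical, and the toy callables stand in for the RL policy and the contact-implicit MPC.

```python
from typing import Callable, List, Tuple

State = float      # toy stand-in for the object/robot state (assumption)
Intention = float  # toy stand-in for a contact intention (assumption)
Action = float     # toy stand-in for the robot action command (assumption)


def run_hierarchy(
    high_level: Callable[[State], Intention],
    low_level: Callable[[State, Intention], Action],
    step: Callable[[State, Action], State],
    state: State,
    n_high_steps: int,
    low_steps_per_intention: int,
) -> Tuple[State, List[Action]]:
    """Run a two-level loop: the high level picks an intention, and the
    low level repeatedly turns (state, intention) into an action command."""
    actions: List[Action] = []
    for _ in range(n_high_steps):
        # Higher-level policy: predict the contact intention from the state.
        intention = high_level(state)
        for _ in range(low_steps_per_intention):
            # Lower-level policy: parameterized by the current intention.
            action = low_level(state, intention)
            state = step(state, action)
            actions.append(action)
    return state, actions


# Toy usage: drive a scalar state toward a target of 1.0.
TARGET = 1.0
final_state, actions = run_hierarchy(
    high_level=lambda s: s + 0.5 * (TARGET - s),  # intermediate subgoal
    low_level=lambda s, goal: 0.5 * (goal - s),   # proportional command
    step=lambda s, a: s + a,                      # trivial integrator
    state=0.0,
    n_high_steps=10,
    low_steps_per_intention=3,
)
```

In this sketch the higher level only decides what the lower level should aim for, while the lower level alone produces the commands that change the state, mirroring the division of labor described above.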
In this task, the goal is to learn a unified policy that pushes a diverse set of letter-shaped objects from arbitrary initial poses to arbitrary planar target poses (position and orientation). Twelve English letters are used in total: six (E, H, L, N, T, X) for training, and the remaining six (F, I, K, V, Y, Z) held out for evaluation.
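The train/evaluation split above can be written down as a small configuration sketch. The letter sets come from the text; the constant names and the `split_of` helper are hypothetical and only illustrate how the held-out shapes might be kept separate from the training shapes.

```python
# Letter sets as stated in the task description.
TRAIN_LETTERS = ("E", "H", "L", "N", "T", "X")
HELD_OUT_LETTERS = ("F", "I", "K", "V", "Y", "Z")

# Sanity checks: the two sets are disjoint and cover twelve letters in total.
assert set(TRAIN_LETTERS).isdisjoint(HELD_OUT_LETTERS)
assert len(set(TRAIN_LETTERS) | set(HELD_OUT_LETTERS)) == 12


def split_of(letter: str) -> str:
    """Return 'train' or 'eval' for a letter (hypothetical helper)."""
    if letter in TRAIN_LETTERS:
        return "train"
    if letter in HELD_OUT_LETTERS:
        return "eval"
    raise ValueError(f"letter {letter!r} is not part of the task")
```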












In this task, the goal is to reorient a cube from a randomized initial pose to a randomized 3D target pose.




















@article{TBD,
author = {Xie, Zhixian and Xiang, Yu and Posa, Michael and Jin, Wanxin},
title = {Where to Touch, How to Contact: A Hierarchical RL–MPC Framework for Geometry-Aware Long-Horizon Dexterous Manipulation},
journal = {tbd},
year = {2026},
}