Achieving human-like dexterity is a longstanding challenge in robotics, in part due to the complexity of planning and control for contact-rich systems. In reinforcement learning (RL), one popular approach has been to use massively-parallelized, domain-randomized simulations to learn a policy offline over a vast array of contact conditions, allowing robust sim-to-real transfer. Inspired by recent advances in real-time parallel simulation, this work considers instead the viability of online planning methods for contact-rich manipulation by studying the well-known in-hand cube reorientation task. We propose a simple architecture that employs a sampling-based predictive controller and vision-based pose estimator to search for contact-rich control actions online. We conduct thorough experiments to assess the real-world performance of our method, architectural design choices, and key factors for robustness, demonstrating that our simple sampling-based approach achieves performance comparable to prior RL-based works.
Our architecture has two parts.
A sampling-based predictive controller. The controller rolls out forward simulations of the system in parallel to compute an open-loop control spline, then repeatedly replans in closed loop while executing those motions. We show that with the simple cross-entropy method (CEM), we can achieve remarkably expressive contact-rich motions on the cube rotation task.
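As a rough illustration, here is a minimal sketch of a CEM planning loop of this kind. The names, sampling parameters, and `rollout` function are placeholders for exposition, not the paper's implementation (which evaluates candidates with parallel simulations).

```python
import numpy as np

def cem_plan(rollout, mu, sigma, num_samples=120, num_elites=12, num_iters=2):
    """One replanning step: refine a Gaussian over control spline knots.

    `rollout(knots)` is assumed to simulate the system forward under the
    spline defined by `knots` and return a scalar cost.
    """
    for _ in range(num_iters):
        # Sample candidate knot sequences around the current mean.
        candidates = mu + sigma * np.random.randn(num_samples, *mu.shape)
        # Score every candidate with a forward simulation (done in parallel
        # across CPU threads in the real system).
        costs = np.array([rollout(c) for c in candidates])
        # Refit the Gaussian to the lowest-cost "elite" samples.
        elites = candidates[np.argsort(costs)[:num_elites]]
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mu  # knots of the new open-loop control spline

# Closed loop: execute the start of the spline, then call cem_plan again from
# the latest state estimate, warm-starting mu from the previous plan.
```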
A vision-based cube pose estimator. The estimator is composed of three parts. First, a fine-tuned ResNet predicts keypoints corresponding to the corners of the cube from RGBD images. Next, given known camera poses and a pinhole camera model, we use a factor-graph-based fixed-lag smoother to estimate the cube pose corresponding to a set of images. Finally, because the smoother may produce estimates with non-negligible hand-cube penetration, we use a corrector to recover a feasible cube pose estimate. The corrector is simply another simulation that maintains its own internal cube and hand states, but imparts a virtual wrench on its cube that attracts it toward the raw estimate from the smoother. This yields feasible cube states that closely track the smoother output.
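To make the corrector idea concrete, here is a hedged sketch using MuJoCo's Python bindings. The body name, gain values, and qpos layout are illustrative assumptions, not the paper's exact implementation.

```python
import mujoco
import numpy as np

class Corrector:
    """Sketch of a penetration corrector: an internal simulation whose cube is
    pulled toward the raw smoother estimate by a virtual wrench, so that
    contact dynamics keep the corrected pose physically feasible."""

    def __init__(self, xml_path, kp=50.0, kr=5.0):
        self.model = mujoco.MjModel.from_xml_path(xml_path)
        self.data = mujoco.MjData(self.model)
        self.cube = self.model.body("cube").id  # body name is an assumption
        self.kp, self.kr = kp, kr               # gains are assumptions

    def correct(self, pos_raw, quat_raw, hand_qpos, steps=5):
        # Mirror the measured hand configuration (index range is a placeholder).
        self.data.qpos[: hand_qpos.size] = hand_qpos
        mujoco.mj_forward(self.model, self.data)
        for _ in range(steps):
            pos = self.data.xpos[self.cube]
            quat = self.data.xquat[self.cube]
            # Translational spring pulling the cube toward the raw estimate.
            force = self.kp * (pos_raw - pos)
            # Rotational spring along the axis of the orientation error.
            q_conj, q_err, w_err = np.zeros(4), np.zeros(4), np.zeros(3)
            mujoco.mju_negQuat(q_conj, quat)
            mujoco.mju_mulQuat(q_err, quat_raw, q_conj)
            mujoco.mju_quat2Vel(w_err, q_err, 1.0)
            # Apply the virtual wrench to the cube and step the physics.
            self.data.xfrc_applied[self.cube, :3] = force
            self.data.xfrc_applied[self.cube, 3:] = self.kr * w_err
            mujoco.mj_step(self.model, self.data)
        return (self.data.xpos[self.cube].copy(),
                self.data.xquat[self.cube].copy())
```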
Here, we summarize a few interesting findings; for fine-grained quantitative details, please see the paper.
DROP performs comparably to RL. Prior works like Dactyl and DeXtreme learn robust cube rotation policies using offline RL and massive domain randomization. It is surprising that our simple online planning approach achieves similar performance and is robust enough to perform well in the real world.
The CEM sampler outperforms predictive sampling and iLQR. Prior work has shown that the simple predictive sampling (PS) strategy is surprisingly effective on contact-rich planning tasks. Even though CEM is only slightly more complex, we find that it substantially outperforms both PS and the gradient-based iLQR planner, which struggles with stiff contact dynamics (consistent with many prior observations). We systematically evaluate the robustness of these planners in the paper, and find that CEM is far more robust to both model and estimator error, which may explain its performance advantage.
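For reference, predictive sampling in the same notation as the CEM sketch above: the key difference is that it keeps only the single best sample instead of refitting a distribution to a set of elites (again a simplified sketch, not the exact implementation from prior work).

```python
import numpy as np

def ps_plan(rollout, mu, sigma, num_samples=120):
    # Sample once around the current nominal plan and keep the single best
    # candidate; that candidate becomes the nominal at the next replan.
    candidates = mu + sigma * np.random.randn(num_samples, *mu.shape)
    costs = [rollout(c) for c in candidates]
    return candidates[int(np.argmin(costs))]
```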
Performance is sensitive to the number of threads. Unsurprisingly, adding more threads improves performance. However, in this work, we already use a server-grade CPU to plan with 120 threads. This motivates future work on real-time parallel contact-rich simulation on GPUs, which could massively boost performance on all contact-rich tasks.
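As a sketch of where the threads go: the cost evaluations inside the planner are independent forward simulations, so they map naturally onto a thread pool. This is effective in Python only because simulators like MuJoCo release the GIL while stepping; `rollout` here refers to the placeholder from the sketches above, and the thread count simply mirrors the 120 threads mentioned in the text.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def parallel_costs(rollout, candidates, num_threads=120):
    # Score candidates concurrently; assumes `rollout` releases the GIL
    # while stepping the physics, as MuJoCo's Python bindings do.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return np.array(list(pool.map(rollout, candidates)))
```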
We have open-sourced the hardware setup for this paper, including a parts list, STL files for 3D-printed parts, and written assembly instructions. The total cost of the parts (excluding tools, a computer, and a 3D printer) is under $3,500.
Download a zip of all these files by clicking here (or the download button at the top of this page).
@article{li2024_drop,
title={DROP: Dexterous Reorientation via Online Planning},
author={Li, Albert H. and Culbertson, Preston and Kurtz, Vince and Ames, Aaron D.},
year={2024},
journal={arXiv preprint arXiv:2409.14562},
note={Available at: \url{https://arxiv.org/abs/2409.14562}},
}