Real-world deployment of COMPASS policy on real robots: Carter and G1.
As robots are increasingly deployed in diverse application domains, enabling robust mobility across different embodiments has become a critical challenge. Classical mobility stacks, though effective on specific platforms, require extensive per-robot tuning and do not scale easily to new embodiments. Learning-based approaches, such as imitation learning (IL), offer an alternative, but they are limited by the need for high-quality demonstrations for each embodiment.
To address these challenges, we introduce COMPASS, a unified framework that enables scalable cross-embodiment mobility using expert demonstrations from only a single embodiment. We first pre-train a mobility policy on a single robot using IL, combining a world model with a policy model. We then apply residual reinforcement learning (RL) to efficiently adapt this policy to diverse embodiments through corrective refinements. Finally, we distill specialist policies into a single generalist policy conditioned on an embodiment embedding vector. This design significantly reduces the burden of collecting data while enabling robust generalization across a wide range of robot designs. Our experiments demonstrate that COMPASS scales effectively across diverse robot platforms while maintaining adaptability to various environment configurations, yielding a generalist policy with a success rate approximately 5X higher than the pre-trained IL policy. COMPASS further demonstrates zero-shot sim-to-real transfer.
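The residual RL step described above can be illustrated with a minimal sketch. It assumes the final command is the pre-trained IL action plus a scaled correction from the residual policy. The function names, the two-element action layout, the `scale` parameter, and the toy policies are all hypothetical and are not taken from the COMPASS code.

```python
from typing import Callable, List

Action = List[float]  # assumed layout: [linear velocity, angular velocity]


def residual_action(
    base_policy: Callable[[List[float]], Action],
    residual_policy: Callable[[List[float], Action], Action],
    observation: List[float],
    scale: float = 1.0,
) -> Action:
    """Return the base action plus a scaled corrective residual."""
    base = base_policy(observation)
    delta = residual_policy(observation, base)
    if len(base) != len(delta):
        raise ValueError("base and residual actions must have the same length")
    return [b + scale * d for b, d in zip(base, delta)]


# Toy stand-ins (hypothetical) for the pre-trained IL policy and the
# residual policy that RL would train for one embodiment.
def toy_il_policy(observation: List[float]) -> Action:
    return [0.5, 0.0]


def toy_residual(observation: List[float], base: Action) -> Action:
    return [-0.25, 0.125]


action = residual_action(toy_il_policy, toy_residual, observation=[0.0, 1.0])
print(action)  # [0.25, 0.125]
```

In this sketch only the residual would be trained per embodiment, which is one way to read "corrective refinements": the IL policy supplies a reasonable default and the residual nudges it toward what the new robot needs.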
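COMPASS introduces a novel three-stage learning workflow for developing cross-embodiment mobility policies: (1) IL pre-training of a mobility policy on a single robot, (2) residual RL fine-tuning to adapt that policy into a specialist for each embodiment, and (3) distillation of the specialists into a single generalist policy conditioned on an embodiment embedding.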
This approach enables efficient transfer of mobility skills across diverse robot platforms while maintaining adaptability to various environment configurations.
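As a rough illustration of the distillation stage, the sketch below feeds a generalist (student) policy the observation concatenated with an embodiment embedding and compares its action to a specialist (teacher) action. The linear policy, the one-hot embedding, and the mean-squared-error loss are assumptions made for the example; the page does not state which loss or network COMPASS uses.

```python
from typing import List


def generalist_input(observation: List[float], embodiment_embedding: List[float]) -> List[float]:
    """Concatenate the observation with the embodiment embedding (assumed conditioning)."""
    return list(observation) + list(embodiment_embedding)


def linear_policy(weights: List[List[float]], inputs: List[float]) -> List[float]:
    """Hypothetical stand-in for the generalist network: one linear layer."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def distillation_loss(student_action: List[float], teacher_action: List[float]) -> float:
    """Mean-squared error between student and teacher actions (assumed loss)."""
    if len(student_action) != len(teacher_action):
        raise ValueError("actions must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_action, teacher_action)]
    return sum(squared) / len(squared)


# Toy example: one observation feature and a one-hot embedding for a
# hypothetical "robot A" out of two embodiments.
inputs = generalist_input([1.0], [1.0, 0.0])
weights = [[0.5, 0.0, 0.0], [0.0, 0.25, 0.0]]
student = linear_policy(weights, inputs)   # [0.5, 0.25]
teacher = [0.5, 0.75]                      # action from that robot's specialist
loss = distillation_loss(student, teacher)
print(loss)  # 0.125
```

Training would minimize this loss over data gathered from every specialist, so that switching the embedding switches the behavior of the single generalist policy.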
Extensive experiments demonstrate that COMPASS achieves robust generalization across diverse robot platforms while preserving the adaptability needed to succeed in varied environments. Quantitatively, the RL specialists and the distilled generalist policy achieve, on average, a 5X higher success rate and 3X lower travel time than the pre-trained IL policy (X-Mobility).
COMPASS policies trained in simulation can be deployed directly on real robots without additional fine-tuning, demonstrating strong sim-to-real transfer.
Open Vocabulary Object Navigation by integrating Locate3D with COMPASS.
COMPASS policy performing open vocabulary object navigation by integrating Locate3D.
GROOT Post-training with COMPASS distillation datasets, enabling navigation capabilities.
GROOT post-training with COMPASS datasets for navigation.
@article{liu2025compass,
title={COMPASS: Cross-embodiment Mobility Policy via Residual RL and Skill Synthesis},
author={Liu, Wei and Zhao, Huihua and Li, Chenran and Deng, Yuchen and Biswas, Joydeep and Pouya, Soha and Chang, Yan},
journal={arXiv preprint arXiv:2502.16372},
year={2025}
}