CHIP: Learning Adaptive Compliance for Humanoid Control through Hindsight Perturbation

Sirui Chen*,1,2   Zi-ang Cao*,1,3   Zhengyi Luo1   Fernando Castañeda1   Chenran Li1   Tingwu Wang1   Ye Yuan1  
Linxi "Jim" Fan1   C. Karen Liu‡,2   Yuke Zhu‡,1,3  
*Co-First Authors, Contributed Equally | ‡Project Leads | 1NVIDIA | 2Stanford | 3UT Austin
Stay tuned for more videos and code release in 1 month.

Video

Complex Real-World Tasks

CHIP enables humanoid robots to perform complex, long-horizon tasks that require adaptive compliance transitions — switching between compliant and stiff modes on-the-fly based on task demands.

Christmas Gift Delivery

Compliant but agile box holding -- maintain stable bimanual surface contact while running

Wipe & Write

Stiff cap open → Hybrid holding+writing → Compliant wiping

Phone Room Delivery

Compliant pulling → Stiff door open → Compliant box grasping

Waste Box Disposal

Compliant box holding + Leg control -- Step on lid pedal before dropping box into trashcan

Abstract

Recent progress in humanoid robots has unlocked agile locomotion skills including backflipping, running, and crawling. Yet it remains challenging for a humanoid robot to perform forceful manipulation tasks such as moving boxes, wiping, and pushing a cart. We propose humanoid natural Compliant control through Hindsight Perturbation (CHIP), a method that plugs into any humanoid motion tracking framework and allows control over robot end-effector stiffness while remaining agile enough to track any reference motion. CHIP is easy to use and requires neither data augmentation nor additional reward tuning. We show that a generalist motion tracking controller trained with CHIP can complete a diverse set of forceful manipulation tasks that require different end-effector compliance, such as multi-robot collaboration, wiping, and door opening.

Why Compliance Matters

Different tasks require different levels of compliance. Being either too stiff or too compliant can lead to failure. CHIP enables adaptive compliance control: steering compliance behavior based on task requirements.

We Need Compliance Mode for Box Lifting

Being too stiff causes unstable grasping. Compliance allows the robot to maintain stable contact.

Compliance Mode

Stable grasping with appropriate compliance

Stiff Mode

Unable to maintain stable two-hand contact -- too stiff



We Also Need Stiff Mode for Other Tasks

Some tasks require stiff mode to generate sufficient force — compliance behavior alone is not enough.

Dumbbell Lifting

Full Cycle

Compliant → Stiff → Compliant

Compliant Mode

Cannot lift — too compliant/not strong enough

Stiff Mode

Strong force generation and accurate following


Large Object Re-orientation

Compliant Mode

Not strong enough — fails to re-orient

Stiff Mode

Well suited to this task: successful re-orientation

Stiff Mode

Forceful flipping of a large box



Beyond Discrete Modes: Continuous Compliance Control

Heavy objects require a stiffer controller to generate sufficient force for lifting,
whereas a more compliant controller makes it easier to maintain stable contact.
We need to balance stiffness and compliance continuously based on task demands.

Different Contents, Different Compliance

Box with 2 Wipes

Compliance 10/k = 0.3

Stable grasping

Box with 2 Wipes + Dumbbell

Compliance 10/k = 0.05

Forceful lifting (~10 lbs)

Box with Wipe + Tape + Ball

Compliance 10/k = 0.4

Stable grasping


Summary: On-the-Fly Compliance Adjustment

This continuous demonstration shows how users can adjust compliance in real time based on box contents:
Empty box (0.5) → 2 Wipes (0.3) → Wipes + Dumbbell (0.2)

Method

CHIP achieves continuous compliance-aware training through a fundamentally lighter approach: hindsight perturbation. Instead of synthesizing perturbation data via inverse kinematics (as in SoftMimic) or tuning additional reward terms, CHIP simply treats the original reference motion as if it were already the result of a perturbation and reconstructs sparse tracking targets on the fly during RL training. No reference motion modification, no data augmentation, no extra reward tuning — making CHIP significantly more scalable than prior methods.
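
One way to read the hindsight idea is sketched below for a single end-effector position. This is an illustration under stated assumptions, not the authors' implementation: it assumes a linear spring model in which an external force f displaces the end effector by f / k from its tracking target, so the target that would land exactly on the reference is the reference minus f / k. The function names, the sign convention, and the sample numbers are all hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, ...]


def compliant_response(target: Sequence[float], force: Sequence[float],
                       stiffness: float) -> Vec3:
    """Assumed spring model: the end effector yields by force / stiffness."""
    return tuple(t + f / stiffness for t, f in zip(target, force))


def hindsight_target(ref_pos: Sequence[float], force: Sequence[float],
                     stiffness: float) -> Vec3:
    """Target whose compliant response under `force` equals `ref_pos`.

    The reference motion itself is left unmodified; only the tracking
    target handed to the policy is reconstructed (hypothetical form).
    """
    if stiffness <= 0:
        raise ValueError("stiffness must be positive")
    return tuple(r - f / stiffness for r, f in zip(ref_pos, force))


# Hypothetical values: a reference hand position, a sampled perturbation
# force, and a sampled stiffness command.
ref = (0.40, 0.10, 0.90)
force = (5.0, 0.0, -10.0)
k = 50.0

target = hindsight_target(ref, force, k)
recovered = compliant_response(target, force, k)
print("reconstructed target:", target)
print("response under force:", recovered)
```

In this reading the perturbed outcome is the unmodified reference, which is why no inverse-kinematics pass over the motion data is needed; only the target is shifted per sampled force and stiffness.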

CHIP Method Overview

AI-Predicted Compliance

Because CHIP's compliance command is grounded in a physically meaningful Hooke's law formulation, VLMs such as Gemini can infer appropriate levels from just a few user-provided anchor examples (e.g., 0.5 for an empty box, 0.2 for a box holding about 7 lbs). Gemini can then predict suitable compliance for unseen objects directly from an image, making the tuning process intuitive and nearly effortless.
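
The "Compliance 10/k" labels on this page suggest that the command c relates to a spring stiffness k by c = 10 / k. The sketch below takes that reading as an assumption and applies Hooke's law, x = F / k, to show what a given command means physically. The units (newtons, N/m, meters) are assumed for illustration and are not stated on the page; the function names are hypothetical.

```python
def stiffness_from_compliance(c: float) -> float:
    """Invert the assumed relation c = 10 / k to recover stiffness k."""
    if c <= 0:
        raise ValueError("compliance command must be positive")
    return 10.0 / c


def deflection(force: float, c: float) -> float:
    """Hooke's law: displacement x = F / k for the stiffness implied by c."""
    return force / stiffness_from_compliance(c)


# Anchor values quoted on this page for the on-the-fly adjustment demo.
anchors = [("empty box", 0.5), ("2 wipes", 0.3), ("wipes + dumbbell", 0.2)]
for label, c in anchors:
    k = stiffness_from_compliance(c)
    x = deflection(10.0, c)
    print(f"{label}: c={c}, k={k:.1f}, deflection under 10 units of force={x:.2f}")
```

Under this reading the command equals the deflection produced by 10 units of force, which may be part of why a handful of anchor examples is enough for a VLM to extrapolate from.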

Box with Unseen Toys

Appropriate

Compliance 10/k = 0.475

Stable grasping with Gemini's prediction


Box with Water Bottle

Appropriate

Compliance 10/k = 0.425

Stable grasping

NOT Appropriate

Compliance 10/k = 0.475 (Too Soft)

Unstable — too compliant under perturbations


Box with Single Dumbbell

Appropriate

Compliance 10/k = 0.25

Stable grasping

NOT Appropriate

Compliance 10/k = 0.15 (Too Stiff)

Unstable grasping due to stiff and competing contact forces

NOT Appropriate

Compliance 10/k = 0.425 (Too Soft)

Unable to lift - insufficient force generation


More Gemini Predictions

Appropriate

Box with Shoe — Compliance 10/k = 0.475

Stable grasping

Appropriate

Box with Tape + Ball — Compliance 10/k = 0.475

Stable grasping

Human-Robot Interactions

Combining CHIP's compliance spectrum with local vs. global tracking controllers unlocks three distinct human-robot interaction modes. When compliance is low, the robot stiffly resists perturbation and accurately tracks the global reference. When compliance is high, the robot yields to perturbation while still returning to the global trajectory. With a damper-only controller on top of CHIP, the robot follows human guidance freely with infinite compliance.
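
The three modes can be illustrated with a one-dimensional point mass attached to a reference at x = 0 by a spring and a damper. This is a toy model under stated assumptions, not the learned controller: the mass, gains, push force, and time step are all hypothetical values chosen for illustration. A temporary push is applied and then released; with a spring the mass returns to the reference, while with a damper alone (zero stiffness) it stays where it was pushed.

```python
import math


def simulate(stiffness: float, damping: float, push_force: float = 10.0,
             push_steps: int = 200, total_steps: int = 2000,
             dt: float = 0.005, mass: float = 1.0):
    """Push a spring-damper point mass, release it, return (peak, final) offset."""
    x = 0.0  # offset from the reference
    v = 0.0
    peak = 0.0
    for step in range(total_steps):
        external = push_force if step < push_steps else 0.0
        accel = (external - stiffness * x - damping * v) / mass
        v += accel * dt  # semi-implicit Euler: update velocity first
        x += v * dt      # then position with the new velocity
        peak = max(peak, abs(x))
    return peak, x


# Critically damped springs (damping = 2 * sqrt(k * m)) for the two tracking modes.
stiff_peak, stiff_final = simulate(200.0, 2.0 * math.sqrt(200.0))
soft_peak, soft_final = simulate(20.0, 2.0 * math.sqrt(20.0))
# Damper-only: zero stiffness, so nothing pulls the mass back.
free_peak, free_final = simulate(0.0, 10.0)

print(f"stiff:       peak={stiff_peak:.3f}, final={stiff_final:.3f}")
print(f"compliant:   peak={soft_peak:.3f}, final={soft_final:.3f}")
print(f"damper-only: peak={free_peak:.3f}, final={free_final:.3f}")
```

The stiff case deflects least and returns, the compliant case deflects more and still returns, and the damper-only case never returns, which corresponds to the "infinite compliance" following behavior described above.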

Stiff + Global Tracking

Low compliance — robot resists perturbation and tracks the global reference

Compliant + Global Tracking

High compliance — robot yields to perturbation while still tracking global reference

Compliant + Local Following

Damper-only controller — infinite compliance, robot follows human guidance freely

Multi-Robot Collaboration

In the global tracking setting, CHIP enables compliant multi-robot grasping using SpringGrasp planning. Robots generate pre-grasp trajectories to approach objects collaboratively.

Collaborative Ball Grasping

CHIP (Compliant)

Successful collaborative grasp

No force policy

Fails to grasp: cannot apply enough force stably


Collaborative Box Grasping

CHIP (Compliant)

Successful collaborative grasp

Stiff Policy

Fails to grasp — stiff and competing contact forces


Collaborative Box Moving

After grasping, robots move the object together following keyboard commands.

Coordinated movement with compliant control

Autonomous VLA Policy

We fine-tune the Vision-Language-Action (VLA) model GR00T N1.5 on data collected with CHIP for autonomous manipulation tasks with adaptive compliance.

Small Whiteboard Wiping

Reactive continuous rollouts with compliant contact (task success rate: 80%)

Large Whiteboard Wiping

A successful run with adaptive compliance (task success rate: 60%)


Box Lifting

A successful run with adaptive compliance (task success rate: 90%)

Box Lifting (Continuous Evaluation)

First 10 consecutive runs from 20 evaluations (task success rate: 90%)

BibTeX

@article{chen_cao_2025chip,
    title={CHIP: Learning Adaptive Compliance for Humanoid Control through Hindsight Perturbation},
    author={Chen, Sirui and Cao, Zi-ang and Luo, Zhengyi and Castañeda, Fernando and Li, Chenran and Wang, Tingwu and Yuan, Ye and Fan, Linxi and Liu, C Karen and Zhu, Yuke},
    journal={arXiv preprint arXiv:2512.14689},
    year={2025}
}