Task General Tracking#
Task-general tracking trains physics-based policies on retargeted GRAIL motion libraries: pick-and-place, manipulation, terrain-aware tracking, and locomotion. The training implementation lives in the imports/SONIC vendored release tree.
This page documents the GRAIL-specific pieces:
The four self-contained HOI release configs (
pnp_*,advanced_manip_*) used to train pick-and-place and manipulation policies.The shared
terrain_trackingconfig used for scene/terrain-aware tracking.
For lower-level training internals see imports/SONIC/README.md and imports/SONIC/docs/.
Install#
Same sonic env as retargeting:
bash scripts/setup/install_env_sonic.sh
This sets up IsaacLab, GMR, GRAIL, and the training stack in one conda env.
Checkpoints#
The base behavior-model warm-start and the task-specific reference bundles for finetuning are all fetched by the project-wide setup script:
bash scripts/setup/download_checkpoints.sh # all submodules
bash scripts/setup/download_checkpoints.sh \
--skip-gem-smpl --skip-gem-soma \
--skip-foundationpose --skip-hunyuan3d # SONIC only
This lands them under imports/SONIC/models/:
imports/SONIC/models/
├── sonic_manipulation_base/ # base warm-start for pickup and manipulation: last.pt + model_config.yaml
├── pnp_table/ # pickup table reference: last.pt + config.yaml
├── pnp_ground/ # pickup ground reference: last.pt + config.yaml
└── terrain_stairs/ # terrain stairs reference: last.pt + config.yaml
Store path references in experiment configs as relative to
imports/SONIC/, not to the GRAIL root — e.g. models/pnp_table/last.pt.
The training commands below assume cd imports/SONIC first, so those
relative paths resolve correctly.
Preparing retargeted data for training#
Retarget as in retargeting.md. The GRAIL retargeting pipeline writes a
<name>_ha/directory containingrobot/,objects/,object_usd/, andmeta/.Move the retargeted folder to
data/motion_lib/<name>_ha/— that prefix is what the training config loaders expect.For multi-object HOI sweeps, also place the BPS encodings under
data/motion_lib/<name>/bps/(BPS is data-only; it is not parameterized by the hand-action variant).For terrain-aware data, re-retarget with
--zero_out_wristto skip hand IK — see Terrain / sitting data.
Task-general tracking configs overview#
Current state of imports/SONIC/gear_sonic/config/exp/manager/universal_token/:
Path |
Purpose |
|---|---|
|
Shared height-map + object-state terrain-aware tracking config used by the |
|
Self-contained tabletop pick-up release config |
|
Self-contained ground pick-up release config |
|
Self-contained advanced-manipulation tabletop config |
|
Self-contained advanced-manipulation ground config |
The release configs share the same launch pattern: choose the Hydra config, then pass runtime data paths through Hydra overrides.
Quick smoke test#
A single-GPU, 4-env, 3-iteration run against the pnp_table release config —
enough to verify the install end-to-end. Completes in ~2 minutes on a single
L40. Set DATA_DIR and BPS_DIR to a retargeted motion library prepared as
described above.
conda activate sonic
export HYDRA_FULL_ERROR=1 PYTHONUNBUFFERED=1 WANDB_MODE=offline
cd imports/SONIC
python -u train_agent_trl.py \
+exp=manager/universal_token/hoi/pnp_table \
num_envs=4 headless=True \
++algo.config.num_learning_iterations=3 \
++manager_env.config.gpu_collision_stack_size_exp=28 \
++manager_env.commands.motion.motion_lib_cfg.motion_file=${DATA_DIR}/robot \
++manager_env.commands.motion.motion_lib_cfg.object_motion_file=${DATA_DIR}/objects \
++manager_env.config.object_usd_path=${DATA_DIR}/object_usd \
++manager_env.commands.motion.motion_lib_cfg.bps_dir=${BPS_DIR}
Running training#
Pick-up and advanced manipulation#
All four release configs share a single launch shape — only the Hydra config
name changes. Set DATA_DIR to your retargeted motion library (with
robot/, objects/, object_usd/ subdirs) and BPS_DIR to the matching
BPS encodings; see Preparing retargeted data for training
for the expected layout. The example script below will launch training with
a single node using 8 GPUs.
conda activate sonic
export HYDRA_FULL_ERROR=1 PYTHONUNBUFFERED=1
cd imports/SONIC
accelerate launch --num_processes=8 train_agent_trl.py \
+exp=${HYDRA_CONFIG} \
num_envs=2048 headless=True \
++manager_env.commands.motion.motion_lib_cfg.motion_file=${DATA_DIR}/robot \
++manager_env.commands.motion.motion_lib_cfg.object_motion_file=${DATA_DIR}/objects \
++manager_env.config.object_usd_path=${DATA_DIR}/object_usd \
++manager_env.commands.motion.motion_lib_cfg.bps_dir=${BPS_DIR}
Available release configs:
Sweep |
|
|---|---|
|
|
|
|
|
|
|
|
Pick-up and manipulation launch inputs:
Flag |
Effect |
|---|---|
|
Selects one release config from the table above. |
|
Robot-motion directory, usually |
|
Object-motion directory, usually |
|
Object USD directory, usually |
|
BPS encoding directory for multi-object pick-up / manipulation data. |
Finetuning a pick-up policy#
To continue from an existing pick-up run, use the matching reference
bundle (downloaded by download_checkpoints.sh above):
Config |
Reference config |
Reference checkpoint |
|---|---|---|
|
|
|
|
|
|
Use the reference config.yaml for provenance and evaluation; the finetune
command below selects the public release config and warm-resumes from
last.pt. Write the new run to a separate experiment_dir.
conda activate sonic
export HYDRA_FULL_ERROR=1 PYTHONUNBUFFERED=1
cd imports/SONIC
python -u train_agent_trl.py \
+exp=manager/universal_token/hoi/pnp_table \
num_envs=2048 headless=True \
++resume=True \
++checkpoint=models/pnp_table/last.pt \
experiment_dir=${FINETUNE_DIR} \
++algo.config.num_learning_iterations=10000 \
++manager_env.commands.motion.motion_lib_cfg.motion_file=${DATA_DIR}/robot \
++manager_env.commands.motion.motion_lib_cfg.object_motion_file=${DATA_DIR}/objects \
++manager_env.config.object_usd_path=${DATA_DIR}/object_usd \
++manager_env.commands.motion.motion_lib_cfg.bps_dir=${BPS_DIR}
Use +exp=manager/universal_token/hoi/pnp_ground and
++checkpoint=models/pnp_ground/last.pt for ground pick-up data.
Terrain-aware tracking#
The current GRAIL terrain-aware runs use one shared height-map + object-state
config. Set DATA_DIR to a retargeted dataset root with robot/, objects/,
and object_usd/.
conda activate sonic
export HYDRA_FULL_ERROR=1 PYTHONUNBUFFERED=1
cd imports/SONIC
python -u train_agent_trl.py \
+exp=manager/universal_token/scene/terrain_tracking \
num_envs=4096 headless=True \
++manager_env.commands.motion.motion_lib_cfg.motion_file=${DATA_DIR}/robot \
++manager_env.commands.motion.motion_lib_cfg.object_motion_file=${DATA_DIR}/objects \
++manager_env.config.terrain_motion_dir=${DATA_DIR}
Use the same config for all scene/terrain-aware datasets and pass only the
runtime data paths through Hydra overrides. If the active dataset root does
not include flat_placeholder.usd, set
++manager_env.config.flat_usd_path=<path> to a compatible placeholder USD.
Terrain launch inputs:
Flag |
Effect |
|---|---|
|
Dataset root with paired |
|
Optional — adds flat (non-terrain) motions, interleaved between terrain envs. |
|
Explicit placeholder USD for non-terrain envs. Falls back to |
|
Every |
The terrain path emits /tmp/rank_<R>_motion_keys.txt per GPU and logs
[TerrainAutoDiscover] / [PerRankUSD] / [PerRankMotion] during init —
grep those to confirm the slicer is doing what you expect.
Finetuning a terrain policy#
Terrain finetuning uses the same warm-resume pattern: keep
models/terrain_stairs/config.yaml with models/terrain_stairs/last.pt,
write to a new output directory, and point DATA_DIR at the next
terrain-aware motion-library partition.
The published reference bundle (models/terrain_stairs/) is the
stairs policy only — use it as the warm-start when finetuning on any
stairs-like dataset. Reference bundles for the other terrain types
(curb, slope, etc.) will be published in a future release.
conda activate sonic
export HYDRA_FULL_ERROR=1 PYTHONUNBUFFERED=1
cd imports/SONIC
python -u train_agent_trl.py \
+exp=manager/universal_token/scene/terrain_tracking \
num_envs=4096 headless=True \
++resume=True \
++checkpoint=models/terrain_stairs/last.pt \
experiment_dir=${FINETUNE_DIR} \
++algo.config.num_learning_iterations=20000 \
++manager_env.commands.motion.motion_lib_cfg.motion_file=${DATA_DIR}/robot \
++manager_env.commands.motion.motion_lib_cfg.object_motion_file=${DATA_DIR}/objects \
++manager_env.config.terrain_motion_dir=${DATA_DIR}
If the dataset root does not provide flat_placeholder.usd, also pass
++manager_env.config.flat_usd_path=<path-to-placeholder-usd>.
Multi-node accelerate template#
Replace the single-node launcher with the multi-node form. Same
train_agent_trl.py command and ${ARGS[@]}, just different launcher
flags. Example: 8 nodes × 8 GPUs = 64 GPUs.
accelerate launch \
--multi_gpu \
--num_machines=8 \
--num_processes=64 \
--machine_rank=$MACHINE_RANK \
--main_process_ip=$MASTER_ADDR \
--main_process_port=$MASTER_PORT \
train_agent_trl.py "${ARGS[@]}" num_envs=2048
See the Accelerate distributed training guide and multi-node launcher docs.
Output layout#
Each run writes to:
logs_rl/TRL_G1_Track/manager/<config_path>/<exp_name>-<timestamp>/
├── config.yaml # full resolved Hydra config
├── model_step_NNNNNN.pt # checkpoint every N iters (algo.config.save_every)
├── last.pt # symlink to the latest step
├── meta.yaml # wandb_id + misc provenance
└── events.out.tfevents.* # tensorboard (optional; wandb is primary)
W&B run name and project come from the +opt=wandb Hydra opt group
(gear_sonic/config/opt/wandb.yaml).
Evaluation#
See imports/SONIC docs for eval workflows. GRAIL does not add eval tooling — eval runs against the training checkpoint directory format directly.
Troubleshooting#
Symptom |
Cause / fix |
|---|---|
|
Retarget output is missing |
Wrong motion-lib format |
Verify |