Streaming Motion Tracking#
Stream motion data to the robot over ZMQ for reference motion tracking. This interface supports streaming either SMPL-based poses (e.g., from PICO) or G1 whole-body joint positions (qpos) from any external source (--input-type zmq).
Prerequisites
Complete the Quick Start to have the sim2sim loop running.
Emergency Stop
Press O at any time to immediately stop control and exit. Always keep a hand near the keyboard ready to press O.
Launch#
Sim2Sim (MuJoCo):
# Terminal 1 — MuJoCo simulator (from repo root)
source .venv_sim/bin/activate
python gear_sonic/scripts/run_sim_loop.py
# Terminal 2 — C++ deployment (from gear_sonic_deploy/)
bash deploy.sh sim --input-type zmq \
--zmq-host <publisher-ip> \
--zmq-port 5556 \
--zmq-topic pose
Real Robot:
# From gear_sonic_deploy/
bash deploy.sh real --input-type zmq \
--zmq-host <publisher-ip> \
--zmq-port 5556 \
--zmq-topic pose
Step-by-Step#
1. Press ] to start the control system.
2. By default you are in reference motion mode: use T to play motions, N/P to switch, and R to restart (same as the keyboard interface).
3. Press ENTER to toggle into ZMQ streaming mode. The terminal will print ZMQ STREAMING MODE: ENABLED.
4. The policy now tracks motion frames arriving from the ZMQ publisher in real time. Playback starts automatically.
5. Press ENTER again to switch back to reference motions. The terminal will print ZMQ STREAMING MODE: DISABLED, and the encode mode resets to 0 (joint-based).
6. Use Q/E to adjust the heading (±0.1 rad per press) in either mode.
7. Press I to reinitialize the base quaternion and reset the heading to zero.
8. When done, press O to stop control and exit.
Note
No planner support — this interface uses pre-loaded and ZMQ-streamed reference motions only. For planner + ZMQ control (e.g., PICO VR teleoperation), use --input-type zmq_manager instead. See the VR Whole-Body Teleop tutorial.
Tip
Build your own streaming source. The ZMQ stream protocol documented below is self-contained — any publisher that sends messages in this format can drive the robot. You can write your own motion capture retargeting pipeline, simulator bridge, or any other source that produces the required fields. No PICO hardware is needed.
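As a rough starting point for a custom source, the sketch below publishes messages on a ZMQ PUB socket using pyzmq. It assumes the deployment connects to port 5556 and filters on the topic prefix pose, as in the launch commands above. The byte-level payload encoding is not described at this point in the document, so encode_payload is a hypothetical JSON placeholder that must be replaced with the real wire format. The topic-as-prefix framing and the 50 Hz send rate are likewise assumptions.

```python
import json
import time


def encode_payload(fields: dict) -> bytes:
    """HYPOTHETICAL placeholder: swap in the real wire format before use."""
    return json.dumps(fields).encode("utf-8")


def frame_message(topic: str, fields: dict) -> bytes:
    """Prefix the payload with the topic so a SUB socket can filter on it (assumed framing)."""
    return topic.encode("utf-8") + encode_payload(fields)


def publish(messages, topic="pose", port=5556, rate_hz=50.0):
    """Bind a PUB socket and send one message per tick. Requires pyzmq."""
    import zmq  # imported lazily so the helpers above work without pyzmq

    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PUB)
    sock.bind(f"tcp://*:{port}")
    try:
        for fields in messages:
            sock.send(frame_message(topic, fields))
            time.sleep(1.0 / rate_hz)  # assumed rate, not taken from the docs
    finally:
        sock.close(linger=0)
```

With a publisher like this running on the machine given as --zmq-host, the deployment side would subscribe to it once ZMQ streaming mode is enabled.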
Using with PICO VR Teleop#
You can use --input-type zmq with the PICO teleop streamer for a simple, streaming-only whole-body teleoperation setup. In this mode, the PICO streams full-body SMPL poses over ZMQ and the deployment side tracks them directly — no locomotion planner, no PICO-button mode switching. All control is done from the keyboard.
Prerequisites#
Completed the Quick Start — you can run the sim2sim loop.
PICO VR hardware is set up: headset and controllers are connected, body tracking is working, and .venv_teleop is installed. See the VR Teleop Setup for installation and calibration.
Launch (Sim2Sim)#
Run three terminals:
Terminal 1 — MuJoCo simulator (from repo root):
source .venv_sim/bin/activate
python gear_sonic/scripts/run_sim_loop.py
Terminal 2 — C++ deployment (from gear_sonic_deploy/):
bash deploy.sh sim --input-type zmq \
--zmq-host localhost \
--zmq-port 5556 \
--zmq-topic pose
Terminal 3 — PICO teleop streamer (from repo root):
source .venv_teleop/bin/activate
# With visualization (recommended for first run):
python gear_sonic/scripts/pico_manager_thread_server.py \
--manager --vis_smpl --vis_vr3pt
# Without visualization (headless):
# python gear_sonic/scripts/pico_manager_thread_server.py --manager
Launch (Real Robot)#
Run two terminals (no MuJoCo):
Terminal 1 — C++ deployment (from gear_sonic_deploy/):
bash deploy.sh real --input-type zmq \
--zmq-host <teleop-machine-ip> \
--zmq-port 5556 \
--zmq-topic pose
Replace <teleop-machine-ip> with localhost if the PICO streamer runs on the same machine; otherwise use the IP address of the machine running the streamer (Terminal 2).
Terminal 2 — PICO teleop streamer (from repo root):
source .venv_teleop/bin/activate
python gear_sonic/scripts/pico_manager_thread_server.py --manager
Step-by-Step#
1. Calibration pose: Stand upright, feet together, upper arms at your sides, forearms bent 90° forward (L-shape at each elbow), palms facing inward.
2. On the PICO controllers, press A + B + X + Y simultaneously to initialize and calibrate the body tracking.
3. Press A + X on the PICO controllers to start streaming poses.
4. In the C++ deployment terminal (Terminal 2 in sim, Terminal 1 on the real robot), press ] to start the control system.
5. In the MuJoCo window (sim only), press 9 to drop the robot to the ground.
6. Back in the C++ deployment terminal, press ENTER to enable ZMQ streaming. The terminal prints ZMQ STREAMING MODE: ENABLED. The robot begins tracking your PICO poses in real time.
7. Move your body and the robot mirrors your motions. Use the Trigger button on each PICO controller to close the corresponding robot hand.
8. To pause streaming (e.g., to reposition yourself), press ENTER again. The terminal prints ZMQ STREAMING MODE: DISABLED. The robot holds its last pose and stops tracking. You can move freely without affecting the robot.
9. To resume, press ENTER once more. The robot will snap to your current pose, so move back close to the robot’s current pose before resuming to avoid sudden jumps.
10. When done, press O to stop control and exit.
DANGER — Resuming from Pause
When you press ENTER to resume streaming after a pause, the robot will immediately try to reach your current physical pose. If your body is in a very different position from the robot, the robot may perform sudden, aggressive motions. Always move back close to the robot’s current pose before pressing ENTER to resume.
Controls#
| Key | Action |
|---|---|
| ] | Start control system |
| O | Stop control and exit (emergency stop) |
| ENTER | Toggle between reference motions and ZMQ streaming |
| I | Reinitialize base quaternion and reset heading |
| Q / E | Adjust delta heading left / right (±0.1 rad) |
Reference motion mode only (streaming off):
| Key | Action |
|---|---|
| T | Play current motion to completion |
| R | Restart current motion from beginning (pause at frame 0) |
| P / N | Previous / Next motion sequence |
Stream Protocol Versions#
The encode mode is determined automatically by the ZMQ stream protocol version. SONIC uses Protocol v1 and v3. Protocol v2 is available for custom applications.
Encode Mode Logic#
The encode mode only takes effect when the policy model has an encoder configured and loaded. At startup, each motion’s encode mode is initialized based on encoder availability:
| Encode mode | Meaning |
|---|---|
| -2 | No encoder / token state configured in the model; encode mode has no effect |
| -1 | Encoder config exists (token state dimension > 0) but no encoder model file provided |
| 0 | Encoder loaded, joint-based mode (default) |
| 1 | Encoder loaded, teleop / 3 points upper-body mode |
| 2 | Encoder loaded, SMPL-based mode |
When ZMQ streaming is active, the protocol version sets the encode mode on the streamed motion: v1 → 0, v2/v3 → 2. This only affects inference if the model actually has an encoder (encode_mode >= 0). If no encoder is configured (-2), the value is set but has no effect on the inference pipeline.
When switching back to reference motions (pressing ENTER to disable streaming), the encode mode resets to 0 (if the motion has an encoder, i.e. encode_mode >= 0).
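The rules above can be condensed into a small sketch. The function names are illustrative and not taken from the deployment code; only the mappings (v1 → 0, v2/v3 → 2, reset to 0 when an encoder is present) come from the text.

```python
NO_ENCODER = -2  # no encoder / token state configured in the model


def encode_mode_for_stream(protocol_version: int) -> int:
    """Encode mode set on a streamed motion: v1 -> 0, v2/v3 -> 2."""
    if protocol_version == 1:
        return 0
    if protocol_version in (2, 3):
        return 2
    raise ValueError(f"unsupported protocol version: {protocol_version}")


def encoder_active(encode_mode: int) -> bool:
    """The encode mode only influences inference when it is >= 0."""
    return encode_mode >= 0


def mode_after_disabling_stream(encode_mode: int) -> int:
    """Reset to 0 on return to reference motions, but only if an encoder exists."""
    return 0 if encoder_active(encode_mode) else encode_mode
```

For example, a motion streamed with Protocol v3 gets mode 2, and falls back to 0 after ENTER disables streaming, while a model without an encoder stays at -2 throughout.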
Common Fields (All Versions)#
All versions require two common fields:
| Field | Shape | Dtype | Description |
|---|---|---|---|
| | | | Body quaternion(s) per frame (w, x, y, z) |
| | | | Monotonically increasing frame indices for alignment |
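Two details in the common fields are easy to get wrong in a custom publisher: the quaternion component order and the frame index ordering. The helpers below are a minimal sketch with hypothetical names. The first assumes the source library emits quaternions as (x, y, z, w), and the second assumes "monotonically increasing" means strictly increasing.

```python
def xyzw_to_wxyz(q):
    """Reorder an (x, y, z, w) quaternion into the (w, x, y, z) order used by the stream."""
    x, y, z, w = q
    return (w, x, y, z)


def is_strictly_increasing(indices):
    """True if every frame index is larger than the one before it."""
    return all(b > a for a, b in zip(indices, indices[1:]))
```

Running such a check on the publisher side before sending is cheap and catches repeated or reordered frames early.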
Warning
Changing the protocol version mid-session is not allowed. If the publisher switches protocol versions while streaming, the interface will automatically disable ZMQ mode and return to reference motions for safety.
Error message: Protocol version changed from X to Y during active ZMQ session!
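The safety behavior described in this warning can be pictured as a small guard around the receive loop. This is an illustrative sketch only: the class and method names are hypothetical, and it assumes the version is locked on the first message after streaming is enabled.

```python
class StreamSession:
    """Tracks the protocol version of an active ZMQ streaming session."""

    def __init__(self):
        self.streaming = False
        self.version = None

    def enable(self):
        self.streaming = True
        self.version = None  # locked in by the first message received

    def on_message(self, version: int) -> bool:
        """Return True if the message is accepted, False if it is rejected."""
        if not self.streaming:
            return False
        if self.version is None:
            self.version = version
            return True
        if version != self.version:
            print(f"Protocol version changed from {self.version} "
                  f"to {version} during active ZMQ session!")
            self.streaming = False  # fall back to reference motions
            self.version = None
            return False
        return True
```

A publisher that needs to change protocol versions should therefore stop, let the operator disable streaming, and start a new session.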
Protocol v1 — Joint-Based (Encode Mode 0)#
Streams raw G1 joint positions and velocities. Use this when your source provides direct qpos/qvel data (e.g., from another simulator or motion capture retargeting pipeline).
Required fields:
| Field | Shape | Dtype | Description |
|---|---|---|---|
| joint_pos | | | Joint positions in IsaacLab order (all 29 joints) |
| joint_vel | | | Joint velocities in IsaacLab order (all 29 joints) |
- N = number of frames per message (batch size).
- All 29 joint values must be provided and meaningful.
- Frame counts of joint_pos and joint_vel must match.
Common errors:
- Version 1 missing required fields (joint_pos, joint_vel): one or both fields are absent.
- Frame count mismatch between joint_pos and joint_vel: the N dimension differs.
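A publisher can assemble and sanity-check the v1 joint fields before sending. The sketch below assumes each field is laid out as N frames of 29 values (nested lists here for simplicity). It covers only joint_pos and joint_vel, not the common fields, and make_v1_fields is a hypothetical helper name.

```python
NUM_JOINTS = 29  # G1 joints in IsaacLab order


def make_v1_fields(joint_pos, joint_vel):
    """Validate and package N frames of joint positions and velocities."""
    if len(joint_pos) != len(joint_vel):
        raise ValueError("Frame count mismatch between joint_pos and joint_vel")
    for name, frames in (("joint_pos", joint_pos), ("joint_vel", joint_vel)):
        for i, frame in enumerate(frames):
            if len(frame) != NUM_JOINTS:
                raise ValueError(
                    f"{name}[{i}] has {len(frame)} values, expected {NUM_JOINTS}"
                )
    return {
        "joint_pos": [[float(v) for v in frame] for frame in joint_pos],
        "joint_vel": [[float(v) for v in frame] for frame in joint_vel],
    }
```

Failing fast on the publisher side gives a clearer error than waiting for the deployment to reject the message.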
Protocol v2 — SMPL-Based (Encode Mode 2)#
Streams SMPL body model data. This protocol is not used by SONIC’s built-in pipelines; it is available for your own custom applications that produce SMPL representations, for example a policy that observes only SMPL data.
Required fields:
| Field | Shape | Dtype | Description |
|---|---|---|---|
| smpl_joints | | | SMPL joint positions (24 joints × xyz) |
| smpl_pose | | | SMPL joint rotations in axis-angle (21 body poses × xyz) |
- joint_pos and joint_vel are optional in v2.
Common errors:
- Version 2 missing required field 'smpl_joints' or 'smpl_pose': required SMPL fields are absent.
Protocol v3 — Joint + SMPL Combined (Encode Mode 2)#
Combines both joint-level and SMPL data. This is what SONIC uses for whole-body teleoperation (e.g., PICO VR).
Required fields:
| Field | Shape | Dtype | Description |
|---|---|---|---|
| joint_pos | | | Joint positions in IsaacLab order |
| joint_vel | | | Joint velocities in IsaacLab order |
| smpl_joints | | | SMPL joint positions (24 joints × xyz) |
| smpl_pose | | | SMPL joint rotations in axis-angle (21 body poses × xyz) |
Important
In Protocol v3, only the 6 wrist joints need meaningful values in joint_pos — the remaining 23 joints can be zero. The wrist joint indices (in IsaacLab order) are: [23, 24, 25, 26, 27, 28] (3 joints per wrist × 2 wrists). The joint_vel values for non-wrist joints can also be zero.
The SMPL fields (smpl_joints, smpl_pose) carry the primary motion data in v3; the wrist joints in joint_pos provide fine-grained wrist control that SMPL alone cannot capture.
Frame counts across all four fields must be consistent.
Common errors:
- Version 3 missing required field 'joint_pos' or 'joint_vel': joint fields are absent (unlike v2, they are required in v3).
- Version 3 frame count mismatch between smpl_joints (X) and joint_pos (Y): the N dimension differs across fields.
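Since only the wrist entries of joint_pos matter in v3, a publisher can start from an all-zero frame and fill in the six wrist values. The sketch below assumes a 29-element frame and uses the wrist indices listed above. Which index corresponds to which wrist axis is not stated here, so the input order is left to the caller. wrist_only_frame is a hypothetical helper name.

```python
NUM_JOINTS = 29
WRIST_INDICES = (23, 24, 25, 26, 27, 28)  # IsaacLab order, per the note above


def wrist_only_frame(wrist_values):
    """Build one joint_pos frame with zeros everywhere except the wrist joints."""
    if len(wrist_values) != len(WRIST_INDICES):
        raise ValueError(f"expected {len(WRIST_INDICES)} wrist values")
    frame = [0.0] * NUM_JOINTS
    for idx, value in zip(WRIST_INDICES, wrist_values):
        frame[idx] = float(value)
    return frame
```

The same pattern applies to joint_vel, where the non-wrist entries can also stay at zero.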
Protocol Summary#
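| Protocol | Encode Mode | Used by SONIC | Required Fields |
|---|---|---|---|
| v1 | 0 | ✅ Yes | joint_pos, joint_vel |
| v2 | 2 | ❌ Custom only | smpl_joints, smpl_pose |
| v3 | 2 | ✅ Yes | joint_pos, joint_vel, smpl_joints, smpl_pose |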
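The summary can also be kept as a lookup table in a custom publisher so that outgoing messages are checked before they are sent. This is a sketch: the dictionary layout and the missing_fields helper are illustrative, and the common fields required by all versions are left out because only the version-specific field names are used here.

```python
PROTOCOLS = {
    1: {"encode_mode": 0, "required": ("joint_pos", "joint_vel")},
    2: {"encode_mode": 2, "required": ("smpl_joints", "smpl_pose")},
    3: {
        "encode_mode": 2,
        "required": ("joint_pos", "joint_vel", "smpl_joints", "smpl_pose"),
    },
}


def missing_fields(version: int, fields: dict) -> list:
    """Return the version-specific required fields absent from a message."""
    if version not in PROTOCOLS:
        raise ValueError(f"unknown protocol version: {version}")
    return [name for name in PROTOCOLS[version]["required"] if name not in fields]
```

An empty result means the message carries every version-specific field; anything else names exactly what the deployment would report as missing.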
Optional Stream Fields#
The following optional fields can be included in any protocol version:
| Field | Shape | Dtype | Description |
|---|---|---|---|
| | | | Left hand 7-DOF Dex3 joint positions |
| | | | Right hand 7-DOF Dex3 joint positions |
| | | | VR 3-point tracking positions: left wrist, right wrist, head (xyz × 3) |
| | | | VR 3-point orientations: left, right, head quaternions (wxyz × 3) |
| | scalar | | If |
| | scalar | | Incremental heading adjustment applied per message |
Configuration#
| Flag | Default | Description |
|---|---|---|
| --zmq-host | | ZMQ publisher host |
| --zmq-port | | ZMQ publisher port |
| --zmq-topic | | ZMQ topic prefix |
| | off | Keep only the latest message (drop stale frames) |
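The "keep only the latest message" behavior can be pictured as a one-slot buffer: each new arrival overwrites whatever has not been consumed yet. The sketch below illustrates those semantics with the standard library only. How the deployment implements the option internally is not described here, and the LatestOnly class is hypothetical.

```python
from collections import deque


class LatestOnly:
    """One-slot buffer that drops stale messages in favor of the newest one."""

    def __init__(self):
        self._slot = deque(maxlen=1)
        self.dropped = 0

    def push(self, msg):
        if self._slot:
            self.dropped += 1  # the unconsumed message is about to be replaced
        self._slot.append(msg)

    def pop(self):
        """Return the newest message, or None if nothing is waiting."""
        return self._slot.popleft() if self._slot else None
```

This trades completeness for latency: the consumer always sees the most recent pose, at the cost of skipping frames when the publisher runs faster than the control loop.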