Configuration System#

Note

ProtoMotions3 uses Python dataclass configs instead of the Hydra/YAML inheritance used in previous versions. This design provides IDE autocomplete, type checking, and easier debugging, and keeps all config logic in one place (the experiment file).

Design#

Why Python dataclasses over Hydra/YAML?

IDE support: Full autocomplete and type hints
No inheritance complexity: No need to manually track which YAML file overrides what
Readable: Config logic is explicit Python code
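
As a minimal illustration of the first two points, the sketch below uses a hypothetical ExampleEnvConfig class (not a ProtoMotions class). A dataclass rejects a misspelled field at construction time, whereas a dict loaded from YAML would accept it silently:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExampleEnvConfig:  # hypothetical config class, for illustration only
    max_episode_length: int = 1000
    contact_bodies: List[str] = field(default_factory=list)


# Correct usage: an IDE can autocomplete and type-check these fields.
cfg = ExampleEnvConfig(max_episode_length=500)

# A misspelled field name fails immediately with a TypeError.
try:
    ExampleEnvConfig(max_episode_lenght=500)  # typo on purpose
    typo_rejected = False
except TypeError:
    typo_rejected = True

# The dict equivalent accepts the same typo without complaint.
yaml_like = {"max_episode_lenght": 500}
```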

Experiment File Structure#

Each experiment file (e.g., examples/experiments/mimic/mlp.py) defines a complete training configuration through a set of functions:

# Required functions
def terrain_config(args) -> TerrainConfig:
    """Build terrain configuration."""
    return TerrainConfig()

def scene_lib_config(args) -> SceneLibConfig:
    """Build scene library configuration."""
    return SceneLibConfig(scene_file=args.scenes_file)

def motion_lib_config(args) -> MotionLibConfig:
    """Build motion library configuration."""
    return MotionLibConfig(motion_file=args.motion_file)

def env_config(robot_cfg, args) -> MimicEnvConfig:
    """Build environment configuration with rewards."""
    return MimicEnvConfig(
        max_episode_length=1000,
        reward_config={...},
        ...
    )

# Optional functions
def configure_robot_and_simulator(robot_cfg, simulator_cfg, args):
    """Customize robot and simulator settings."""
    robot_cfg.update_fields(contact_bodies=[...])

def agent_config(robot_cfg, env_cfg, args) -> PPOAgentConfig:
    """Build agent/network configuration."""
    return PPOAgentConfig(
        model=PPOModelConfig(...),
        batch_size=args.batch_size,
        ...
    )

def apply_inference_overrides(robot_cfg, simulator_cfg, env_cfg, agent_cfg, args):
    """Apply evaluation-time overrides (e.g., disable early termination)."""
    env_cfg.mimic_early_termination = None
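
A sketch of how a loader could import such a file by path and check for the functions listed above. The helper names load_experiment and optional_hooks are hypothetical, and this is not necessarily how train_agent.py does it; the required/optional split simply follows the comments in the listing.

```python
import importlib.util
from pathlib import Path
from types import ModuleType

REQUIRED = ("terrain_config", "scene_lib_config", "motion_lib_config", "env_config")
OPTIONAL = ("configure_robot_and_simulator", "agent_config", "apply_inference_overrides")


def load_experiment(path: str) -> ModuleType:
    """Import an experiment file by path and verify the required functions exist."""
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load experiment file: {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    missing = [name for name in REQUIRED if not callable(getattr(module, name, None))]
    if missing:
        raise AttributeError(f"{path} is missing required functions: {missing}")
    return module


def optional_hooks(module: ModuleType) -> dict:
    """Return only the optional functions that the experiment file defines."""
    return {name: getattr(module, name) for name in OPTIONAL if hasattr(module, name)}
```

Failing early on a missing required function gives a clearer error than an AttributeError raised halfway through config building.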

Config Building Flow#

When you run train_agent.py, configs are built in this order:

Load robot config from factory (--robot-name)
Load simulator config (--simulator)
Call configure_robot_and_simulator() for customization
Build terrain_config(), scene_lib_config(), motion_lib_config()
Build env_config()
Build agent_config()
Apply CLI overrides (--overrides)
Save all to resolved_configs.pt
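
The order above can be sketched as a single function. This is an illustration with stub functions, not the code in train_agent.py: build_configs and the apply_overrides callback are hypothetical names, and steps 1, 2 and 8 (loading from the factories and saving) are left to the caller.

```python
from types import SimpleNamespace


def build_configs(experiment, robot_cfg, simulator_cfg, args, apply_overrides):
    """Run build steps 3-7 in the documented order and return all configs."""
    # Step 3: optional customization hook.
    hook = getattr(experiment, "configure_robot_and_simulator", None)
    if hook is not None:
        hook(robot_cfg, simulator_cfg, args)
    # Step 4: terrain, scene library, motion library (dict values evaluate in order).
    configs = {
        "robot": robot_cfg,
        "simulator": simulator_cfg,
        "terrain": experiment.terrain_config(args),
        "scene_lib": experiment.scene_lib_config(args),
        "motion_lib": experiment.motion_lib_config(args),
    }
    # Step 5: the environment config receives the (possibly customized) robot config.
    configs["env"] = experiment.env_config(robot_cfg, args)
    # Step 6: the agent config can depend on the environment config.
    agent_fn = getattr(experiment, "agent_config", None)
    if agent_fn is not None:
        configs["agent"] = agent_fn(robot_cfg, configs["env"], args)
    # Step 7: CLI overrides run last, so they win over the experiment file.
    apply_overrides(configs, args.overrides)
    return configs


# Stub experiment that records the order in which its functions are called.
calls = []


def _stub(name):
    def fn(*_args):
        calls.append(name)
        return name
    return fn


experiment = SimpleNamespace(**{name: _stub(name) for name in (
    "configure_robot_and_simulator", "terrain_config", "scene_lib_config",
    "motion_lib_config", "env_config", "agent_config")})
args = SimpleNamespace(overrides=[])
configs = build_configs(
    experiment, {}, {}, args, lambda cfgs, overrides: calls.append("overrides")
)
```

Because overrides are applied after every builder has run, an override can target any field of the fully built configs.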

Robot Configurations#

Robot configs in protomotions/robot_configs/ define the robot:

@dataclass
class G1RobotConfig(RobotConfig):
    # Map common names to robot-specific body names
    common_naming_to_robot_body_names: Dict[str, List[str]] = field(
        default_factory=lambda: {
            "all_left_foot_bodies": ["left_ankle_roll_link"],
            "all_right_foot_bodies": ["right_ankle_roll_link"],
            "head_body_name": ["head"],
            "torso_body_name": ["torso_link"],
        }
    )

    # Asset configuration
    asset: RobotAssetConfig = field(default_factory=lambda: RobotAssetConfig(
        asset_file_name="mjcf/g1_bm_no_mesh_box_feet.xml",
        usd_asset_file_name="usd/g1_bm/g1_bm.usda",
    ))

    # PD control parameters per joint (regex patterns)
    control: ControlConfig = field(default_factory=lambda: ControlConfig(
        override_control_info={
            ".*_hip_(pitch|yaw)_joint": ControlInfo(
                stiffness=40.0, damping=8.0, effort_limit=88,
            ),
            ".*_knee_joint": ControlInfo(
                stiffness=99.0, damping=19.8, effort_limit=139,
            ),
        }
    ))

    # Per-simulator physics settings
    simulation_params: SimulatorParams = field(default_factory=lambda: SimulatorParams(
        isaacgym=IsaacGymSimParams(fps=100, decimation=2),
        newton=NewtonSimParams(fps=200, decimation=4),
    ))
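
The keys of override_control_info are regular expressions over joint names. A minimal sketch of how such patterns could be resolved follows. It assumes full-string matching and first-match-wins, neither of which is confirmed by this page, and the joint names are hypothetical examples.

```python
import re
from typing import Dict, Optional

# Patterns copied from the example above; values reduced to stiffness only.
override_stiffness: Dict[str, float] = {
    ".*_hip_(pitch|yaw)_joint": 40.0,
    ".*_knee_joint": 99.0,
}


def stiffness_for(joint_name: str) -> Optional[float]:
    """Return the stiffness of the first pattern that fully matches the joint name."""
    for pattern, stiffness in override_stiffness.items():
        if re.fullmatch(pattern, joint_name):
            return stiffness
    return None  # no override matched; a real config would fall back to a default


hip = stiffness_for("left_hip_pitch_joint")    # matches the hip pattern
knee = stiffness_for("right_knee_joint")       # matches the knee pattern
roll = stiffness_for("left_hip_roll_joint")    # "roll" is not in (pitch|yaw)
```

One pattern covers both the left and right joints, which keeps per-joint gain tables short.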

Using Configurations#

Basic Usage#

python protomotions/train_agent.py \
    --robot-name g1 \
    --simulator isaacgym \
    --experiment-path examples/experiments/mimic/mlp.py \
    --experiment-name my_experiment \
    --motion-file path/to/motions.pt \
    --num-envs 4096 \
    --batch-size 16384

CLI Overrides#

Override nested config values with --overrides:

--overrides "agent.num_mini_epochs=4" "env.max_episode_length=500"

# Override reward weights
--overrides "env.reward_config.contact_match_rew.weight=0.0"

# Enable robot self-collisions (NOTE: booleans must be written "True", not "true")
--overrides "robot.asset.self_collisions=True"
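
One way a dotted override like these could be applied is sketched below. The helpers parse_value and apply_override and the two config classes are hypothetical. The sketch assumes values are parsed as Python literals, which would explain the "True" versus "true" note, but that is an assumption about the real parser.

```python
import ast
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AssetCfg:  # hypothetical stand-in for the robot asset config
    self_collisions: bool = False


@dataclass
class RobotCfg:  # hypothetical stand-in for the robot config
    asset: AssetCfg = field(default_factory=AssetCfg)


def parse_value(text: str) -> Any:
    """Parse a Python literal ("True", "4", "0.0"); fall back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_override(configs: dict, override: str) -> None:
    """Apply one "root.path.to.field=value" override in place."""
    dotted, raw_value = override.split("=", 1)
    root, *path = dotted.split(".")
    if not path:
        raise ValueError(f"Override must name a field: {override}")
    target = configs[root]
    # Walk to the parent of the final field, through attributes or dict keys.
    for name in path[:-1]:
        target = target[name] if isinstance(target, dict) else getattr(target, name)
    value = parse_value(raw_value)
    if isinstance(target, dict):
        target[path[-1]] = value
    else:
        if not hasattr(target, path[-1]):
            raise AttributeError(f"Unknown config field: {dotted}")
        setattr(target, path[-1], value)


configs = {"robot": RobotCfg()}
apply_override(configs, "robot.asset.self_collisions=True")
```

Under this parsing rule, "true" is not a Python literal, so it would be stored as the string "true" rather than as a boolean.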

Saved Configurations#

All configurations are saved for reproducibility:

results/<experiment_name>/
├── config.yaml              # CLI arguments + wandb_id
├── resolved_configs.pt      # Full config objects (pickled) - primary
├── resolved_configs.yaml    # Human-readable (best-effort)
├── experiment_config.py     # Copy of experiment file
└── resolved_configs_inference.pt  # Configs with eval overrides

resolved_configs.pt is the primary source of truth. It uses pickle to handle complex types (Union, nested dataclasses, torch.Tensor) that YAML cannot represent.
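
The difference can be seen with a small round trip. This sketch uses hypothetical InnerCfg and OuterCfg classes and the standard-library pickle module directly, with JSON standing in for YAML as the text format because the standard library has no YAML parser.

```python
import json
import pickle
from dataclasses import asdict, dataclass, field
from typing import Union


@dataclass
class InnerCfg:  # hypothetical nested config
    weight: float = 0.5


@dataclass
class OuterCfg:  # hypothetical top-level config
    inner: InnerCfg = field(default_factory=InnerCfg)
    threshold: Union[int, float, None] = None
    shape: tuple = (3, 4)


original = OuterCfg(inner=InnerCfg(weight=0.1), threshold=2.5)

# Pickle restores the same classes and field types.
restored = pickle.loads(pickle.dumps(original))

# A text round trip yields plain dicts and lists; class identity is lost.
as_text = json.loads(json.dumps(asdict(original)))
```

Pickle stores references to classes rather than their definitions, so the classes must still be importable, under the same names, when the file is loaded.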

Warning

Do NOT modify resolved_configs.yaml files. They are generated for human readability only; the actual source of truth is the .pt file.

To modify configurations:

Small changes: Use --overrides on the command line
Large changes: Use --create-config-only to generate new configs, then copy the newly generated .pt file to your checkpoint directory

Resume Behavior#

Warning

Resume uses the exact saved configs. CLI overrides are ignored during resume.

If you need to change configs, start a new experiment with --experiment-name.

Training modes:

Fresh start: New experiment name → build configs from experiment file
Resume: Same experiment name with existing checkpoint → load from resolved_configs.pt
Warm start: --checkpoint <path> with new experiment name → new configs, old weights
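
The three modes can be summarized as a small decision function. This is a sketch, not the actual logic in train_agent.py: select_training_mode is a hypothetical helper, and it assumes that the presence of resolved_configs.pt in the experiment directory is what marks an existing run.

```python
from pathlib import Path
from typing import Optional


def select_training_mode(experiment_dir: Path, checkpoint: Optional[str]) -> str:
    """Classify a run as 'resume', 'warm_start', or 'fresh_start'."""
    # Resume: this experiment name already has saved configs from an earlier run.
    if (experiment_dir / "resolved_configs.pt").exists():
        return "resume"
    # Warm start: new experiment name, weights taken from --checkpoint.
    if checkpoint is not None:
        return "warm_start"
    # Fresh start: new experiment name and no checkpoint.
    return "fresh_start"
```

Checking for a resume first mirrors the warning above: once saved configs exist for an experiment name, they take precedence over anything passed on the command line.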

Use Old Checkpoints with New Configs or Code Changes#

Generate configs without training (useful for migrating old checkpoints):

python protomotions/train_agent.py \
    --robot-name g1 --simulator isaacgym \
    --experiment-path examples/experiments/mimic/mlp.py \
    --experiment-name migrated_experiment \
    --motion-file /path/to/motion.pt \
    --num-envs 4096 --batch-size 16384 \
    --create-config-only

When code or config classes change, regenerate the resolved configs for public artifacts instead of adding hidden inference-time migration hooks.

Component Configuration (Observations, Rewards, Terminations)#

Components use MdpComponent to bind pure tensor kernels to context paths:

from protomotions.envs.context_views import EnvContext
from protomotions.envs.mdp_component import MdpComponent
from protomotions.envs.rewards import compute_gt_rew, compute_action_smoothness

reward_components = {
    "gt_rew": MdpComponent(
        compute_func=compute_gt_rew,                      # Pure tensor function
        dynamic_vars={                                  # Map params to context paths
            "current_rigid_body_pos": EnvContext.current.rigid_body_pos,
            "ref_rigid_body_pos": EnvContext.mimic.ref_state.rigid_body_pos,
        },
        static_params={"weight": 0.5, "coefficient": -100.0},  # Static parameters
    ),
    "action_smoothness": MdpComponent(
        compute_func=compute_action_smoothness,
        dynamic_vars={
            "current_processed_action": EnvContext.current_processed_action,
            "previous_processed_action": EnvContext.previous_processed_action,
        },
        static_params={"weight": -0.02},
    ),
}

Key features:

Type-safe bindings: IDE autocomplete for context paths (EnvContext.current.rigid_body_pos)
Explicit dependencies: Bindings show exactly what data each kernel needs
Pure kernels: Functions take tensors, return tensors - easy to test
ONNX-ready: Bindings map directly to ONNX inputs
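
To make the binding idea concrete, here is a simplified stand-in for MdpComponent. ExampleComponent is hypothetical: it uses dotted strings instead of context path objects and plain floats instead of tensors, so it shows only the mechanism, not the real class.

```python
import math
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, Callable, Dict


@dataclass
class ExampleComponent:  # hypothetical, simplified stand-in for MdpComponent
    compute_func: Callable[..., Any]
    dynamic_vars: Dict[str, str]  # parameter name -> dotted context path
    static_params: Dict[str, Any] = field(default_factory=dict)

    def evaluate(self, ctx: Any) -> Any:
        """Resolve each bound path on the context, then call the pure kernel."""
        kwargs = dict(self.static_params)
        for param, path in self.dynamic_vars.items():
            value = ctx
            for name in path.split("."):
                value = getattr(value, name)
            kwargs[param] = value
        return self.compute_func(**kwargs)


def height_error_rew(current_height, ref_height, weight, coefficient):
    """Pure kernel: weight * exp(coefficient * squared error)."""
    return weight * math.exp(coefficient * (current_height - ref_height) ** 2)


component = ExampleComponent(
    compute_func=height_error_rew,
    dynamic_vars={"current_height": "current.height", "ref_height": "ref.height"},
    static_params={"weight": 0.5, "coefficient": -100.0},
)
ctx = SimpleNamespace(current=SimpleNamespace(height=1.0), ref=SimpleNamespace(height=1.0))
reward = component.evaluate(ctx)  # zero error, so the reward equals the weight
```

The kernel never sees the context object, so it can be unit-tested with hand-written inputs.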

Context paths provide dual access:

Class access (for config): EnvContext.current.rigid_body_pos → FieldPath object
Instance access (at runtime): ctx.current.rigid_body_pos → Tensor value

This design provides type safety and IDE autocomplete, and makes dependencies explicit.
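
Dual access of this kind can be built with a Python descriptor. The sketch below is one possible mechanism, not the actual EnvContext implementation: ContextField and ExampleContext are hypothetical, the FieldPath class here is a stand-in for the real one, and the path is flat rather than nested to keep the example short.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class FieldPath:  # stand-in: a symbolic path such as ('rigid_body_pos',)
    parts: Tuple[str, ...]

    def resolve(self, ctx: Any) -> Any:
        """Follow the path on a context instance and return the stored value."""
        value = ctx
        for name in self.parts:
            value = getattr(value, name)
        return value


class ContextField:
    """Descriptor: class access yields a FieldPath, instance access the value."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:  # accessed on the class: return the symbolic path
            return FieldPath((self.name,))
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class ExampleContext:  # hypothetical context with a single field
    rigid_body_pos = ContextField()

    def __init__(self, rigid_body_pos):
        self.rigid_body_pos = rigid_body_pos


path = ExampleContext.rigid_body_pos               # FieldPath, usable in configs
ctx = ExampleContext(rigid_body_pos=[0.0, 0.0, 1.0])
value = ctx.rigid_body_pos                          # the runtime value
```

Because the path is created from a real class attribute, a misspelled field name raises AttributeError when the config is written rather than at runtime.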

Agent/Model Configuration#

Network architectures are composed through configs:

from protomotions.agents.ppo.config import PPOActorConfig, PPOModelConfig
from protomotions.agents.common.config import MLPWithConcatConfig, MLPLayerConfig

actor_config = PPOActorConfig(
    num_out=robot_config.kinematic_info.num_dofs,
    actor_logstd=-2.9,
    in_keys=["max_coords_obs", "mimic_target_poses"],
    mu_model=MLPWithConcatConfig(
        in_keys=["max_coords_obs", "mimic_target_poses"],
        normalize_obs=True,
        layers=[MLPLayerConfig(units=1024, activation="relu") for _ in range(6)],
        output_activation="tanh",
    ),
)

The in_keys/out_keys system connects observations to network inputs and connects network layers/modules to one another. TensorDict handles the data flow, which also makes ONNX export easier.
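
The routing idea can be shown without any dependencies. In this sketch a plain dict stands in for TensorDict and lists of floats stand in for tensors. KeyedModule, concat and scale are hypothetical names, and the "features" and "action" keys are invented for the example.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


class KeyedModule:
    """Reads its in_keys from a shared dict and writes its result under out_key."""

    def __init__(self, fn: Callable[..., Vector], in_keys: Sequence[str], out_key: str):
        self.fn = fn
        self.in_keys = list(in_keys)
        self.out_key = out_key

    def __call__(self, data: Dict[str, Vector]) -> Dict[str, Vector]:
        inputs = [data[key] for key in self.in_keys]
        data[self.out_key] = self.fn(*inputs)
        return data


def concat(*vectors: Vector) -> Vector:
    """Concatenate the inputs, as a concat-style MLP input stage would."""
    return [x for vector in vectors for x in vector]


def scale(vector: Vector) -> Vector:
    """Toy stand-in for a network layer."""
    return [0.5 * x for x in vector]


pipeline = [
    KeyedModule(concat, ["max_coords_obs", "mimic_target_poses"], "features"),
    KeyedModule(scale, ["features"], "action"),
]

data = {"max_coords_obs": [1.0, 2.0], "mimic_target_poses": [3.0]}
for module in pipeline:
    data = module(data)
```

Each module declares what it reads and writes, so modules can be reordered or swapped without changing the call signatures of their neighbors.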

Debugging Tips#

Ask AI assistants: To understand the meaning of config fields, you can ask your favorite AI coding assistant.

Next Steps#

Developer Tips - Quality of life tips
See examples/experiments/ for more examples