protomotions.utils.export_utils module

Utilities for exporting trained models to ONNX format.

This module provides functions to export TensorDict-based models to ONNX format using torch.onnx.export. The exported models can be used for deployment and inference in production environments.

Key Functions:
  • export_onnx: Export a TensorDictModule to ONNX format

  • export_ppo_model: Export a trained PPO model to ONNX

  • export_with_action_processing: Export model with baked-in action processing

class protomotions.utils.export_utils.ONNXExportWrapper(module, in_keys, batch_size)

Bases: torch.nn.Module

Wrapper for TensorDictModule that accepts positional args for ONNX export.

TensorDictModules expect a TensorDict argument, but torch.onnx.export uses positional tensor inputs. This wrapper bridges the gap.

__init__(module, in_keys, batch_size)
forward(*args)

Forward that reconstructs TensorDict from positional args.
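The bridging idea can be sketched without any tensor library. The snippet below is a minimal, dependency-free illustration and not the wrapper's actual implementation: plain dicts stand in for TensorDict, and `PositionalAdapter`, its `out_keys` argument, and `toy_policy` are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, Sequence, Tuple


class PositionalAdapter:
    """Hypothetical stand-in: positional inputs outside, keyed inputs inside."""

    def __init__(
        self,
        module: Callable[[Dict[str, float]], Dict[str, float]],
        in_keys: Sequence[str],
        out_keys: Sequence[str],
    ) -> None:
        self.module = module
        self.in_keys = list(in_keys)
        self.out_keys = list(out_keys)

    def forward(self, *args: float) -> Tuple[float, ...]:
        if len(args) != len(self.in_keys):
            raise ValueError(f"expected {len(self.in_keys)} inputs, got {len(args)}")
        # Rebuild the keyed container from the positional inputs, in in_keys order.
        keyed_inputs = dict(zip(self.in_keys, args))
        keyed_outputs = self.module(keyed_inputs)
        # Flatten the keyed outputs back into a positional tuple.
        return tuple(keyed_outputs[key] for key in self.out_keys)


def toy_policy(inputs: Dict[str, float]) -> Dict[str, float]:
    """Toy dict-based 'module' used only for this illustration."""
    return {"action": inputs["obs"] * 2.0 + inputs["prev_action"]}


adapter = PositionalAdapter(toy_policy, in_keys=["obs", "prev_action"], out_keys=["action"])
result = adapter.forward(1.5, 0.25)
```

The order of `in_keys` fixes the order of the positional inputs, so the exported graph and its callers have to agree on that order.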

class protomotions.utils.export_utils.ActionProcessingModule(
control_type,
clamp_value,
pd_action_offset=None,
pd_action_scale=None,
torque_limits=None,
stiffness=None,
damping=None,
)

Bases: torch.nn.Module

Standalone action processing module for ONNX export.

This module takes raw actions and applies clamping and PD/torque transformations. It’s designed to be exported separately from the policy model.

For PD control modes (BUILT_IN_PD, PROPORTIONAL):

output = pd_action_offset + pd_action_scale * clamp(action)

For TORQUE control mode:

output = clamp(action) * torque_limits
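As a worked illustration of the two formulas, the following dependency-free sketch applies them to plain Python lists. It is not the module's implementation: `clamp`, `pd_targets`, and `torque_targets` are hypothetical helper names, and the numbers are made up for the example.

```python
from typing import List, Sequence


def clamp(value: float, limit: float) -> float:
    """Clamp a scalar to the closed interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def pd_targets(
    action: Sequence[float],
    clamp_value: float,
    pd_action_offset: Sequence[float],
    pd_action_scale: Sequence[float],
) -> List[float]:
    """PD modes: output = pd_action_offset + pd_action_scale * clamp(action)."""
    return [
        offset + scale * clamp(a, clamp_value)
        for a, offset, scale in zip(action, pd_action_offset, pd_action_scale)
    ]


def torque_targets(
    action: Sequence[float],
    clamp_value: float,
    torque_limits: Sequence[float],
) -> List[float]:
    """TORQUE mode: output = clamp(action) * torque_limits."""
    return [clamp(a, clamp_value) * limit for a, limit in zip(action, torque_limits)]


raw_action = [2.0, -0.5]  # the first entry exceeds the clamp range
pd_out = pd_targets(raw_action, 1.0, pd_action_offset=[0.5, 0.25], pd_action_scale=[0.5, 2.0])
torque_out = torque_targets(raw_action, 1.0, torque_limits=[10.0, 20.0])
```

Here the first action is clamped from 2.0 to 1.0 before scaling, so the PD targets are 1.0 and -0.75 and the torques are 10.0 and -10.0.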

Parameters:
  • control_type (ControlType) – Control type enum (BUILT_IN_PD, PROPORTIONAL, or TORQUE).

  • clamp_value (float) – Value to clamp actions to [-clamp_value, clamp_value].

  • pd_action_offset (torch.Tensor | None) – Offset for PD target computation (for PD modes).

  • pd_action_scale (torch.Tensor | None) – Scale for PD target computation (for PD modes).

  • torque_limits (torch.Tensor | None) – Torque limits (required for TORQUE mode).

  • stiffness (torch.Tensor | None) – Per-DOF stiffness values for PD control (optional).

  • damping (torch.Tensor | None) – Per-DOF damping values for PD control (optional).

__init__(
control_type,
clamp_value,
pd_action_offset=None,
pd_action_scale=None,
torque_limits=None,
stiffness=None,
damping=None,
)
forward(action)

Process raw action into PD targets or torques.

Parameters:

action (torch.Tensor) – Raw action tensor [batch_size, num_actions]

Returns:

Tuple of (actions, targets, stiffness_targets, damping_targets). Each tensor has shape [batch_size, num_actions].

Return type:

tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]

class protomotions.utils.export_utils.ActionProcessingONNXWrapper(
module,
in_keys,
batch_size,
action_key,
control_type,
clamp_value,
pd_action_offset=None,
pd_action_scale=None,
torque_limits=None,
)

Bases: torch.nn.Module

Wrapper that includes action post-processing in the ONNX export.

This wrapper combines the policy model with action clamping and PD/torque transformations, producing final targets that can be directly applied by the simulator without further processing.

For PD control modes (BUILT_IN_PD, PROPORTIONAL):

output = pd_action_offset + pd_action_scale * clamp(action)

For TORQUE control mode:

output = clamp(action) * torque_limits

Parameters:
  • module (tensordict.nn.TensorDictModuleBase) – TensorDictModule (policy) to wrap.

  • in_keys (list) – List of input keys for the module.

  • batch_size (int) – Batch size for TensorDict reconstruction.

  • action_key (str) – Key in module output containing the action (e.g., “action”).

  • control_type (ControlType) – Control type enum (BUILT_IN_PD, PROPORTIONAL, or TORQUE).

  • clamp_value (float) – Value to clamp actions to [-clamp_value, clamp_value].

  • pd_action_offset (torch.Tensor | None) – Offset for PD target computation (for PD modes).

  • pd_action_scale (torch.Tensor | None) – Scale for PD target computation (for PD modes).

  • torque_limits (torch.Tensor | None) – Torque limits for TORQUE mode scaling.

out_keys

Output keys for the wrapper (either [“pd_targets”] or [“torque_targets”]).

__init__(
module,
in_keys,
batch_size,
action_key,
control_type,
clamp_value,
pd_action_offset=None,
pd_action_scale=None,
torque_limits=None,
)
forward(*args)

Forward pass: model -> action_processor.

protomotions.utils.export_utils.export_action_processing(
control_type,
clamp_value,
pd_action_offset,
pd_action_scale,
torque_limits,
num_actions,
path,
batch_size=1,
meta=None,
validate=True,
opset_version=17,
)

Export action processing module to ONNX.

Creates a standalone ONNX model that takes raw actions and outputs PD targets or torques.

Parameters:
  • control_type (ControlType) – Control type enum (BUILT_IN_PD, PROPORTIONAL, or TORQUE).

  • clamp_value (float) – Value to clamp actions to [-clamp_value, clamp_value].

  • pd_action_offset (torch.Tensor) – Offset for PD target computation.

  • pd_action_scale (torch.Tensor) – Scale for PD target computation.

  • torque_limits (torch.Tensor) – Torque limits for TORQUE mode.

  • num_actions (int) – Number of action dimensions.

  • path (str) – Path to save the ONNX model (must end with .onnx).

  • batch_size (int) – Batch size for sample input (default: 1).

  • meta (Dict[str, Any] | None) – Optional additional metadata to save.

  • validate (bool) – If True, validates the exported model with onnxruntime.

  • opset_version (int) – ONNX opset version to use (default: 17).

protomotions.utils.export_utils.export_onnx(
module,
td,
path,
meta=None,
validate=True,
opset_version=17,
)

Export a TensorDictModule to ONNX format.

Uses torch.onnx.export to export the module. Creates a wrapper that converts between TensorDict and positional tensor inputs for ONNX compatibility.

Parameters:
  • module (tensordict.nn.TensorDictModuleBase) – TensorDictModule to export.

  • td (tensordict.TensorDict) – Sample TensorDict input (used for tracing).

  • path (str) – Path to save the ONNX model (must end with .onnx).

  • meta (Dict[str, Any] | None) – Optional additional metadata to save.

  • validate (bool) – If True, validates the exported model with onnxruntime.

  • opset_version (int) – ONNX opset version to use (default: 17).

Raises:

ValueError – If path doesn’t end with .onnx.
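The path requirement amounts to a simple suffix check before any export work starts. A minimal sketch of such a guard, assuming nothing about how the library implements it (`ensure_onnx_path` is a hypothetical helper name):

```python
from pathlib import Path


def ensure_onnx_path(path: str) -> Path:
    """Return the path as a Path object, or raise if it lacks the .onnx suffix."""
    if not path.endswith(".onnx"):
        raise ValueError(f"Export path must end with .onnx, got: {path!r}")
    return Path(path)


checked = ensure_onnx_path("policy.onnx")
```

Failing early like this avoids tracing a model only to discover the destination is invalid.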

Example

>>> import torch
>>> from tensordict import TensorDict
>>> from protomotions.agents.ppo.model import PPOModel
>>> from protomotions.utils.export_utils import export_onnx
>>> model = PPOModel(config)
>>> sample_input = TensorDict({"obs": torch.randn(1, 128)}, batch_size=1)
>>> export_onnx(model, sample_input, "policy.onnx")

protomotions.utils.export_utils.export_with_action_processing(
module,
td,
path,
action_key,
control_type,
clamp_value,
pd_action_offset=None,
pd_action_scale=None,
torque_limits=None,
meta=None,
validate=True,
opset_version=17,
)

Export a TensorDictModule to ONNX with baked-in action processing.

This function wraps the model with action clamping and PD/torque transformations, producing an ONNX model that outputs final targets (PD targets or torques) ready to be applied directly by the simulator.

For PD control modes:

output = pd_action_offset + pd_action_scale * clamp(action, -clamp_value, clamp_value)

For TORQUE control mode:

output = clamp(action, -clamp_value, clamp_value) * torque_limits

Parameters:
  • module (tensordict.nn.TensorDictModuleBase) – TensorDictModule (policy) to export.

  • td (tensordict.TensorDict) – Sample TensorDict input (used for tracing).

  • path (str) – Path to save the ONNX model (must end with .onnx).

  • action_key (str) – Key in module output containing the action (e.g., “action”, “mean_action”).

  • control_type (ControlType) – Control type enum (BUILT_IN_PD, PROPORTIONAL, or TORQUE).

  • clamp_value (float) – Value to clamp actions to [-clamp_value, clamp_value].

  • pd_action_offset (torch.Tensor | None) – Offset tensor for PD target computation (required for PD modes).

  • pd_action_scale (torch.Tensor | None) – Scale tensor for PD target computation (required for PD modes).

  • torque_limits (torch.Tensor | None) – Torque limits tensor (required for TORQUE mode).

  • meta (Dict[str, Any] | None) – Optional additional metadata to save.

  • validate (bool) – If True, validates the exported model with onnxruntime.

  • opset_version (int) – ONNX opset version to use (default: 17).

Raises:

ValueError – If path doesn’t end with .onnx or required parameters are missing.
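Which parameters count as "required" depends on the control type, as the parameter descriptions above indicate. The following sketch shows one way to express that check; it is an assumption about the shape of the logic, not the library's code. Control types are represented by their names as strings, and `REQUIRED_BY_MODE`, `missing_parameters`, and `check_required_parameters` are hypothetical.

```python
from typing import Any, Dict, List

# Assumed grouping, taken from the "(required for ...)" notes in the parameter list.
REQUIRED_BY_MODE: Dict[str, List[str]] = {
    "BUILT_IN_PD": ["pd_action_offset", "pd_action_scale"],
    "PROPORTIONAL": ["pd_action_offset", "pd_action_scale"],
    "TORQUE": ["torque_limits"],
}


def missing_parameters(control_type_name: str, **params: Any) -> List[str]:
    """List the required parameter names that were left as None or not passed."""
    return [
        name
        for name in REQUIRED_BY_MODE[control_type_name]
        if params.get(name) is None
    ]


def check_required_parameters(control_type_name: str, **params: Any) -> None:
    """Raise ValueError naming every missing parameter for the given mode."""
    missing = missing_parameters(control_type_name, **params)
    if missing:
        raise ValueError(
            f"{control_type_name} mode requires: {', '.join(missing)}"
        )


pd_missing = missing_parameters("BUILT_IN_PD", pd_action_offset=[0.0], pd_action_scale=None)
torque_missing = missing_parameters("TORQUE", torque_limits=[50.0])
```

Reporting all missing names at once spares the caller a fix-and-retry loop.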

Example

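>>> import torch
>>> from tensordict import TensorDict
>>> from protomotions.agents.ppo.model import PPOModel
>>> from protomotions.utils.export_utils import export_with_action_processing
>>> # ControlType must also be in scope; its import path is not shown in this section.
>>> model = PPOModel(config)
>>> sample_input = TensorDict({"obs": torch.randn(1, 128)}, batch_size=1)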
>>> export_with_action_processing(
...     model._actor,
...     sample_input,
...     "policy_with_actions.onnx",
...     action_key="mean_action",
...     control_type=ControlType.BUILT_IN_PD,
...     clamp_value=1.0,
...     pd_action_offset=torch.zeros(29),
...     pd_action_scale=torch.ones(29),
... )

protomotions.utils.export_utils.export_ppo_actor(actor, sample_obs, path, validate=True)

Export a PPO actor’s mu network to ONNX.

Exports the mean network (mu) of a PPO actor, which is the core policy network without the distribution layer. Uses real observations from the environment to ensure proper tracing.

Parameters:
  • actor – PPOActor instance to export.

  • sample_obs (Dict[str, torch.Tensor]) – Sample observation dict from environment (via agent.get_obs()).

  • path (str) – Path to save the ONNX model.

  • validate (bool) – If True, validates the exported model.

Example

>>> # Get real observations from environment
>>> env.reset()
>>> sample_obs = agent.get_obs()
>>> export_ppo_actor(agent.model._actor, sample_obs, "ppo_actor.onnx")

protomotions.utils.export_utils.export_ppo_critic(critic, sample_obs, path, validate=True)

Export a PPO critic network to ONNX.

Uses real observations from the environment to ensure proper tracing.

Parameters:
  • critic – PPO critic (MultiHeadedMLP) instance to export.

  • sample_obs (Dict[str, torch.Tensor]) – Sample observation dict from environment (via agent.get_obs()).

  • path (str) – Path to save the ONNX model.

  • validate (bool) – If True, validates the exported model.

Example

>>> # Get real observations from environment
>>> env.reset()
>>> sample_obs = agent.get_obs()
>>> export_ppo_critic(agent.model._critic, sample_obs, "ppo_critic.onnx")

protomotions.utils.export_utils.export_ppo_model(model, sample_obs, output_dir, validate=True)

Export a complete PPO model (actor and critic) to ONNX.

Exports both the actor and critic networks to separate ONNX files in the specified directory.

Parameters:
  • model – PPOModel instance to export.

  • sample_obs (Dict[str, torch.Tensor]) – Sample observation dict for tracing.

  • output_dir (str) – Directory to save the ONNX models.

  • validate (bool) – If True, validates the exported models.

Returns:

Dict with paths to exported files.

Example

>>> import torch
>>> model = trained_agent.model
>>> sample_obs = {"obs": torch.randn(1, 128)}
>>> paths = export_ppo_model(model, sample_obs, "exported_models/")
>>> print(paths)
{'actor': 'exported_models/actor.onnx', 'critic': 'exported_models/critic.onnx'}
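
The returned dict maps each network name to its file inside the output directory. A small sketch of how such a mapping could be assembled, assuming the `actor.onnx` and `critic.onnx` file names shown in the example output (`planned_export_paths` is a hypothetical helper, and POSIX-style paths are used so the result is platform-independent):

```python
from pathlib import PurePosixPath
from typing import Dict


def planned_export_paths(output_dir: str) -> Dict[str, str]:
    """Map each network name to '<output_dir>/<name>.onnx'."""
    base = PurePosixPath(output_dir)
    return {name: str(base / f"{name}.onnx") for name in ("actor", "critic")}


paths = planned_export_paths("exported_models/")
```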

class protomotions.utils.export_utils.UnifiedPipelineModule(
observation_module,
policy_module,
action_processing_module,
policy_in_keys,
policy_action_key='mean_action',
passthrough_keys=None,
)

Bases: torch.nn.Module

Unified module that combines observations + policy + action processing.

This module takes raw context tensors as inputs and returns raw actions, post-processed actions (PD targets or torques), and stiffness/damping targets.

Pipeline: Context -> Observations -> Policy -> Action Processing

Inputs are split into two groups:
  1. Context tensors for observation computation (from observation_module.get_input_keys())

  2. Additional tensors passed directly to policy (e.g., historical observations)

Outputs:
  • actions: Raw actions from the policy (used by env.step())

  • joint_pos_targets: Actions after clamping and PD/torque transform

  • stiffness_targets: Per-DOF stiffness values (constant, broadcast to batch)

  • damping_targets: Per-DOF damping values (constant, broadcast to batch)

Parameters:
  • observation_module (ObservationExportModule) – ObservationExportModule for computing observations

  • policy_module (torch.nn.Module) – Policy model (actor) that takes observations and returns actions

  • action_processing_module (ActionProcessingModule) – ActionProcessingModule for post-processing

  • policy_in_keys (list) – List of observation keys the policy expects as input

  • policy_action_key (str) – Key for the action output from policy (e.g., ‘mean_action’)

  • passthrough_keys (list | None) – Additional keys passed directly to policy (not from observations)

__init__(
observation_module,
policy_module,
action_processing_module,
policy_in_keys,
policy_action_key='mean_action',
passthrough_keys=None,
)
get_all_input_keys()

Get all input keys: observation context + passthrough.

forward(*all_tensors)

Run the full pipeline from context to actions.

Parameters:

*all_tensors – Input tensors, in this order:
  1. observation_module.get_input_keys()

  2. passthrough_keys

Returns:

Tuple of (actions, joint_pos_targets, stiffness_targets, damping_targets).

Return type:

tuple
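The input ordering and the three pipeline stages can be illustrated with plain Python values. This is a sketch under stated assumptions, not the module's code: scalars stand in for tensors, only two of the four outputs are modelled, and `split_inputs`, `run_pipeline`, and the toy stage functions are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple


def split_inputs(
    all_tensors: Sequence[float],
    context_keys: Sequence[str],
    passthrough_keys: Sequence[str],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split positional inputs into the context group and the passthrough group."""
    expected = len(context_keys) + len(passthrough_keys)
    if len(all_tensors) != expected:
        raise ValueError(f"expected {expected} inputs, got {len(all_tensors)}")
    n_context = len(context_keys)
    context = dict(zip(context_keys, all_tensors[:n_context]))
    passthrough = dict(zip(passthrough_keys, all_tensors[n_context:]))
    return context, passthrough


def run_pipeline(
    all_tensors: Sequence[float],
    context_keys: Sequence[str],
    passthrough_keys: Sequence[str],
    compute_obs: Callable[[Dict[str, float]], Dict[str, float]],
    policy: Callable[[Dict[str, float]], float],
    process_action: Callable[[float], float],
) -> Tuple[float, float]:
    """Context -> Observations -> Policy -> Action Processing."""
    context, passthrough = split_inputs(all_tensors, context_keys, passthrough_keys)
    observations = compute_obs(context)
    # The policy sees both computed observations and passthrough inputs.
    raw_action = policy({**observations, **passthrough})
    return raw_action, process_action(raw_action)


outputs = run_pipeline(
    (0.25, 0.5, 0.25),
    context_keys=["joint_pos", "joint_vel"],
    passthrough_keys=["historical_obs"],
    compute_obs=lambda ctx: {"proprio": ctx["joint_pos"] + ctx["joint_vel"]},
    policy=lambda obs: obs["proprio"] * 2.0 + obs["historical_obs"],
    process_action=lambda a: max(-1.0, min(1.0, a)),
)
```

With these toy stages the raw action is 1.75 and the processed target is clamped to 1.0.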

protomotions.utils.export_utils.export_unified_pipeline(
observation_configs,
sample_context,
policy_module,
policy_in_keys,
policy_action_key,
action_processing_module,
path,
device,
robot_config,
passthrough_obs=None,
validate=True,
meta=None,
)

Export the complete pipeline (context -> actions) as a single ONNX model.

Creates a unified model that:
  1. Computes observations from context

  2. Runs the policy to get raw actions (using both computed obs and passthrough inputs)

  3. Applies action processing (clamp + PD/torque transform)

Outputs four tensors:
  • actions: Raw actions from policy (used by env.step())

  • joint_pos_targets: Actions after clamping and PD/torque transform

  • stiffness_targets: Per-DOF stiffness values (constant)

  • damping_targets: Per-DOF damping values (constant)

Also generates a YAML configuration file for isaac-deploy.

Parameters:
  • observation_configs (Dict[str, Any]) – Dict of observation component configurations.

  • sample_context (Dict[str, Any]) – Sample context dict for tracing.

  • policy_module (torch.nn.Module) – Policy model (actor) that takes observations.

  • policy_in_keys (list) – List of observation keys the policy expects.

  • policy_action_key (str) – Key for action output from policy.

  • action_processing_module (ActionProcessingModule) – ActionProcessingModule instance.

  • path (str) – Path to save the ONNX model.

  • device (torch.device) – Device for tensor operations.

  • robot_config (Any) – Robot configuration with kinematic_info and control.

  • passthrough_obs (Dict[str, torch.Tensor] | None) – Dict of additional observations passed directly to policy (e.g., historical observations). Keys are obs names, values are sample tensors.

  • validate (bool) – If True, validates with onnxruntime.

  • meta (Dict[str, Any] | None) – Optional metadata to include.

Returns:

Path to the exported ONNX model.

Return type:

str

class protomotions.utils.export_utils.ObservationExportModule(observation_configs, sample_context, device)

Bases: torch.nn.Module

Module that wraps observation functions for ONNX export.

This module takes raw context tensors as inputs and computes observations by calling the configured observation functions. It’s designed to be exported to ONNX for deployment.

The module resolves all variable mappings at construction time and stores them as a mapping from input tensor names to function argument names.

Parameters:
  • observation_configs (Dict[str, Any]) – Dict of observation component configurations.

  • sample_context (Dict[str, Any]) – Sample context dict to determine input shapes and resolve mappings.

  • device (torch.device) – Device for tensor operations.

Example

>>> from protomotions.envs.obs.mimic_obs_functions import mimic_target_poses_max_coords_factory
>>> configs = {"mimic_target_poses": mimic_target_poses_max_coords_factory()}
>>> context = env._get_global_context()
>>> module = ObservationExportModule(configs, context, device)
>>> # Export to ONNX (export_observations builds its own module from the configs)
>>> export_observations(configs, context, "observations.onnx", device)

__init__(
observation_configs,
sample_context,
device,
)
get_input_keys()

Get ordered list of input context keys needed.

get_output_keys()

Get ordered list of output observation names.

forward(*args)

Compute all observations from input tensors.

Parameters:

*args – Input tensors in the order of get_input_keys().

Returns:

Tuple of observation tensors in the order of get_output_keys().

Return type:

tuple
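To make the "resolve mappings at construction time" idea concrete, here is a dependency-free sketch. It is an illustration only: the config layout (a function plus an argument-to-context-key mapping), the class name `ObservationSketch`, and the toy observation functions are all assumptions, and scalars stand in for tensors.

```python
import math
from typing import Callable, Dict, List, Tuple

ObsConfig = Tuple[Callable[..., float], Dict[str, str]]  # (function, {arg_name: context_key})


class ObservationSketch:
    """Hypothetical stand-in: resolves input keys once, then runs on positional inputs."""

    def __init__(self, configs: Dict[str, ObsConfig]) -> None:
        self._configs = configs
        # Resolve the ordered, de-duplicated list of context keys up front.
        self._input_keys: List[str] = []
        for _, arg_to_key in configs.values():
            for context_key in arg_to_key.values():
                if context_key not in self._input_keys:
                    self._input_keys.append(context_key)

    def get_input_keys(self) -> List[str]:
        return list(self._input_keys)

    def get_output_keys(self) -> List[str]:
        return list(self._configs)

    def forward(self, *args: float) -> Tuple[float, ...]:
        if len(args) != len(self._input_keys):
            raise ValueError(f"expected {len(self._input_keys)} inputs, got {len(args)}")
        by_key = dict(zip(self._input_keys, args))
        # Call each observation function with its own named arguments.
        return tuple(
            fn(**{arg: by_key[key] for arg, key in arg_to_key.items()})
            for fn, arg_to_key in self._configs.values()
        )


def height_above_ground(root_z: float, ground_z: float) -> float:
    return root_z - ground_z


def planar_speed(vx: float, vy: float) -> float:
    return math.hypot(vx, vy)


sketch = ObservationSketch({
    "height": (height_above_ground, {"root_z": "root_pos_z", "ground_z": "ground_height"}),
    "speed": (planar_speed, {"vx": "root_vel_x", "vy": "root_vel_y"}),
})
observations = sketch.forward(1.0, 0.25, 3.0, 4.0)
```

Because the key order is fixed at construction, `forward` needs no lookups by name from the caller, which is what a positional ONNX graph requires.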

protomotions.utils.export_utils.export_observations(
observation_configs,
sample_context,
path,
device,
validate=True,
meta=None,
)

Export observation computation to ONNX format.

Creates an ObservationExportModule from the observation configs and exports it to ONNX. The exported model takes raw context tensors as inputs and produces observation tensors as outputs.

Parameters:
  • observation_configs (Dict[str, Any]) – Dict of observation component configurations.

  • sample_context (Dict[str, Any]) – Sample context dict for tracing and shape inference.

  • path (str) – Path to save the ONNX model.

  • device (torch.device) – Device for tensor operations.

  • validate (bool) – If True, validates with onnxruntime.

  • meta (Dict[str, Any] | None) – Optional metadata to include in the JSON sidecar.

Returns:

Path to the exported ONNX model.

Return type:

str

Example

>>> configs = env.config.observation_components
>>> context = env._get_global_context()
>>> export_observations(configs, context, "observations.onnx", device)
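
The `meta` parameter feeds a JSON sidecar written alongside the model. The sidecar's actual file name and schema are not documented here, so the sketch below is an assumption throughout: it writes `<model stem>.json` next to the `.onnx` path and stores input keys, output keys, and the user metadata (`write_sidecar` and the key names are hypothetical).

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional, Sequence


def write_sidecar(
    onnx_path: str,
    input_keys: Sequence[str],
    output_keys: Sequence[str],
    meta: Optional[Dict[str, Any]] = None,
) -> Path:
    """Write a JSON file next to the model describing its inputs and outputs."""
    sidecar_path = Path(onnx_path).with_suffix(".json")  # assumed naming scheme
    payload: Dict[str, Any] = {
        "input_keys": list(input_keys),
        "output_keys": list(output_keys),
    }
    if meta:
        payload["meta"] = dict(meta)
    sidecar_path.write_text(json.dumps(payload, indent=2))
    return sidecar_path


with tempfile.TemporaryDirectory() as tmp_dir:
    written = write_sidecar(
        str(Path(tmp_dir) / "observations.onnx"),
        input_keys=["root_pos_z", "ground_height"],
        output_keys=["height"],
        meta={"robot": "example"},
    )
    loaded = json.loads(written.read_text())
```

Keeping the key order in a sidecar lets a deployment runtime feed positional inputs without importing the training code.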