protomotions.agents.masked_mimic.config module

Configuration classes for MaskedMimic agent.

MaskedMimic uses a VAE-based architecture for versatile motion imitation with masked conditioning and latent space learning.

class protomotions.agents.masked_mimic.config.KLDScheduleConfig(
init_kld_coeff=0.0001,
end_kld_coeff=0.01,
start_epoch=3000,
end_epoch=6000,
)

Bases: object

Configuration for KL divergence scheduling in VAE training.

Attributes:

init_kld_coeff: Initial KL divergence coefficient.
end_kld_coeff: Final KL divergence coefficient.
start_epoch: Epoch to start KLD coefficient annealing.
end_epoch: Epoch to end KLD coefficient annealing.

init_kld_coeff: float = 0.0001
end_kld_coeff: float = 0.01
start_epoch: int = 3000
end_epoch: int = 6000
__init__(
init_kld_coeff=0.0001,
end_kld_coeff=0.01,
start_epoch=3000,
end_epoch=6000,
)
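
The four fields above describe an annealing window but not the shape of the curve inside it. The following is a minimal sketch, assuming linear interpolation between `start_epoch` and `end_epoch`; the actual schedule used by the agent may differ. `KLDScheduleSketch` and `kld_coeff_at` are hypothetical stand-ins written for this illustration and are not part of protomotions.

```python
from dataclasses import dataclass


@dataclass
class KLDScheduleSketch:
    """Local stand-in mirroring the documented fields and defaults."""

    init_kld_coeff: float = 0.0001
    end_kld_coeff: float = 0.01
    start_epoch: int = 3000
    end_epoch: int = 6000


def kld_coeff_at(epoch: int, cfg: KLDScheduleSketch) -> float:
    """Hypothetical helper: KLD coefficient at a given epoch.

    Assumption: the coefficient is held at init_kld_coeff before
    start_epoch, held at end_kld_coeff after end_epoch, and linearly
    interpolated in between.
    """
    if epoch <= cfg.start_epoch:
        return cfg.init_kld_coeff
    if epoch >= cfg.end_epoch:
        return cfg.end_kld_coeff
    frac = (epoch - cfg.start_epoch) / (cfg.end_epoch - cfg.start_epoch)
    return cfg.init_kld_coeff + frac * (cfg.end_kld_coeff - cfg.init_kld_coeff)


cfg = KLDScheduleSketch()
for epoch in (0, 3000, 4500, 6000, 10000):
    print(epoch, kld_coeff_at(epoch, cfg))
```

With the default values, the coefficient under this assumed schedule stays at its initial value for the first 3000 epochs and reaches its final value at epoch 6000.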
class protomotions.agents.masked_mimic.config.VaeNoiseType(value)

Bases: Enum

Type of noise for VAE sampling.

NORMAL = 'normal'
UNIFORM = 'uniform'
ZEROS = 'zeros'
classmethod from_str(value)

Create enum from string, case-insensitive.
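
A case-insensitive string constructor of this kind is commonly written by lower-casing the input and looking it up by value. The sketch below illustrates that pattern on a local stand-in enum, `NoiseTypeSketch`, which is hypothetical; the error message and the exact matching rule are assumptions and may not match the library's implementation.

```python
from enum import Enum


class NoiseTypeSketch(Enum):
    """Local stand-in with the same members as the documented enum."""

    NORMAL = "normal"
    UNIFORM = "uniform"
    ZEROS = "zeros"

    @classmethod
    def from_str(cls, value: str) -> "NoiseTypeSketch":
        # Assumption: matching is done against the lower-cased member value.
        try:
            return cls(value.lower())
        except ValueError:
            valid = ", ".join(member.value for member in cls)
            raise ValueError(
                f"Unknown noise type {value!r}; expected one of: {valid}"
            ) from None


print(NoiseTypeSketch.from_str("Normal"))
print(NoiseTypeSketch.from_str("ZEROS"))
```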

class protomotions.agents.masked_mimic.config.VaeConfig(
kld_schedule=<factory>,
vae_latent_dim=64,
vae_noise_type=VaeNoiseType.NORMAL,
)

Bases: object

Configuration for VAE-specific parameters.

Attributes:

kld_schedule: KL divergence annealing schedule.
vae_latent_dim: Dimension of VAE latent space.
vae_noise_type: Type of noise for latent sampling: normal, uniform, or zeros.

kld_schedule: KLDScheduleConfig
vae_latent_dim: int = 64
vae_noise_type: VaeNoiseType = 'normal'
__init__(
kld_schedule=<factory>,
vae_latent_dim=64,
vae_noise_type=VaeNoiseType.NORMAL,
)
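
The `<factory>` default shown for `kld_schedule` is how a dataclass field declared with `default_factory` appears in a generated signature: each instance builds its own nested config rather than sharing one object. The sketch below shows the pattern with local stand-in dataclasses (`KLDScheduleStandIn` and `VaeStandIn` are hypothetical names); it assumes the real configs are dataclasses, which the signatures suggest but this page does not state.

```python
from dataclasses import dataclass, field


@dataclass
class KLDScheduleStandIn:
    """Local stand-in for the nested schedule config."""

    init_kld_coeff: float = 0.0001
    end_kld_coeff: float = 0.01
    start_epoch: int = 3000
    end_epoch: int = 6000


@dataclass
class VaeStandIn:
    """Local stand-in with a factory-built nested config."""

    # default_factory builds a fresh schedule per instance.
    kld_schedule: KLDScheduleStandIn = field(default_factory=KLDScheduleStandIn)
    vae_latent_dim: int = 64
    vae_noise_type: str = "normal"


default_cfg = VaeStandIn()

# Override only what differs from the defaults.
custom_cfg = VaeStandIn(
    kld_schedule=KLDScheduleStandIn(start_epoch=1000, end_epoch=2000),
    vae_latent_dim=32,
)

# Mutating one instance's schedule leaves the other untouched.
custom_cfg.kld_schedule.end_kld_coeff = 0.05
print(default_cfg.kld_schedule.end_kld_coeff, custom_cfg.kld_schedule.end_kld_coeff)
```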
class protomotions.agents.masked_mimic.config.FeedForwardModelConfig(
_target_='protomotions.agents.masked_mimic.model.FeedForwardModel',
in_keys=<factory>,
out_keys=<factory>,
trunk=<factory>,
)

Bases: BaseModelConfig

Configuration for FeedForwardModel (non-VAE variant).

Attributes:

in_keys: Input keys.
out_keys: Output keys.
trunk: Main trunk network for forward pass.

trunk: ModuleContainerConfig
__init__(
_target_='protomotions.agents.masked_mimic.model.FeedForwardModel',
in_keys=<factory>,
out_keys=<factory>,
trunk=<factory>,
)
class protomotions.agents.masked_mimic.config.MaskedMimicModelConfig(
_target_='protomotions.agents.masked_mimic.model.MaskedMimicModel',
in_keys=<factory>,
out_keys=<factory>,
encoder=<factory>,
prior=<factory>,
trunk=<factory>,
vae=<factory>,
optimizer=<factory>,
)

Bases: BaseModelConfig

Configuration for MaskedMimic Model (VAE-based imitation learning).

Attributes:

in_keys: Input keys.
out_keys: Output keys.
encoder: VAE encoder network (maps observations to latent).
prior: Prior network for latent distribution.
trunk: Decoder trunk network (latent to actions).
vae: VAE configuration (latent dim, KLD schedule, etc).
optimizer: Optimizer settings for model training.

encoder: ModuleContainerConfig
prior: ModuleContainerConfig
trunk: ModuleContainerConfig
vae: VaeConfig
optimizer: OptimizerConfig
__init__(
_target_='protomotions.agents.masked_mimic.model.MaskedMimicModel',
in_keys=<factory>,
out_keys=<factory>,
encoder=<factory>,
prior=<factory>,
trunk=<factory>,
vae=<factory>,
optimizer=<factory>,
)
class protomotions.agents.masked_mimic.config.MaskedMimicAgentConfig(
batch_size,
training_max_steps,
_target_='protomotions.agents.masked_mimic.agent.MaskedMimic',
model=<factory>,
num_steps=32,
gradient_clip_val=0.0,
fail_on_bad_grads=False,
check_grad_mag=True,
gamma=0.99,
bounds_loss_coef=0.0,
task_reward_w=1.0,
num_mini_epochs=1,
training_early_termination=None,
save_epoch_checkpoint_every=1000,
save_last_checkpoint_every=10,
max_episode_length_manager=None,
evaluator=<factory>,
normalize_rewards=True,
normalized_reward_clamp_value=5.0,
expert_model_path=None,
)

Bases: BaseAgentConfig

Main configuration class for MaskedMimic Agent.

Attributes:

batch_size: Training batch size.
training_max_steps: Maximum training steps.
model: Model configuration (VAE or FeedForward variant).
num_steps: Environment steps per update.
gradient_clip_val: Max gradient norm. 0=disabled.
fail_on_bad_grads: Fail on NaN/Inf gradients.
check_grad_mag: Log gradient magnitude.
gamma: Discount factor.
bounds_loss_coef: Action bounds loss. 0 for tanh outputs.
task_reward_w: Task reward weight.
num_mini_epochs: Mini-epochs per update.
training_early_termination: Stop early at this step. None=disabled.
save_epoch_checkpoint_every: Save epoch_xxx.ckpt every N epochs.
save_last_checkpoint_every: Save last.ckpt every K epochs.
max_episode_length_manager: Episode length curriculum.
evaluator: Evaluation config.
normalize_rewards: Normalize rewards.
normalized_reward_clamp_value: Clamp normalized rewards to [-val, val].
expert_model_path: Path to pre-trained expert model checkpoint.

model: MaskedMimicModelConfig | FeedForwardModelConfig
expert_model_path: str | None = None
__init__(
batch_size,
training_max_steps,
_target_='protomotions.agents.masked_mimic.agent.MaskedMimic',
model=<factory>,
num_steps=32,
gradient_clip_val=0.0,
fail_on_bad_grads=False,
check_grad_mag=True,
gamma=0.99,
bounds_loss_coef=0.0,
task_reward_w=1.0,
num_mini_epochs=1,
training_early_termination=None,
save_epoch_checkpoint_every=1000,
save_last_checkpoint_every=10,
max_episode_length_manager=None,
evaluator=<factory>,
normalize_rewards=True,
normalized_reward_clamp_value=5.0,
expert_model_path=None,
)
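
To make the interaction between `normalize_rewards` and `normalized_reward_clamp_value` concrete, the sketch below standardizes a batch of rewards and then clamps the result to [-val, val]. It is an illustration only: `normalize_and_clamp` is a hypothetical helper, and the use of per-batch mean and standard deviation is an assumption (the agent may, for example, track running statistics instead).

```python
from statistics import fmean, pstdev


def normalize_and_clamp(rewards, clamp_value=5.0, eps=1e-8):
    """Hypothetical helper: standardize rewards, then clamp them.

    Assumption: normalization uses the mean and population standard
    deviation of the given batch; eps guards against division by zero.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)
    normalized = [(r - mean) / (std + eps) for r in rewards]
    # Clamp every normalized reward into [-clamp_value, clamp_value].
    return [max(-clamp_value, min(clamp_value, x)) for x in normalized]


# A single large outlier among otherwise identical rewards.
rewards = [0.0] * 99 + [1000.0]
clamped = normalize_and_clamp(rewards, clamp_value=5.0)
print(min(clamped), max(clamped))
```

Under these assumptions, clamping bounds the influence of a rare outlier reward on the update while leaving typical rewards unchanged.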