GaussianActor

GBRL model that produces the parameters of a Gaussian policy distribution (mean and log standard deviation). Used for continuous control tasks, especially in algorithms like SAC, with support for fixed or learnable standard deviations.

class gbrl.models.actor.GaussianActor(tree_struct: Dict, input_dim: int, output_dim: int, mu_optimizer: Dict, std_optimizer: Dict | None = None, log_std_init: float = -2, params: Dict = {}, bias: numpy.ndarray | float | None = None, verbose: int = 0, device: str = 'cpu')

Bases: BaseGBT

GBRL model for an actor ensemble used in algorithms such as SAC. This model outputs the mean (mu) and log standard deviation (log_std) of a Gaussian distribution, allowing stochastic action selection.
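To illustrate how a mean and log standard deviation translate into stochastic action selection, here is a minimal NumPy sketch. It does not call GBRL at all: `sample_gaussian_action` is a hypothetical helper, and the `mu` and `log_std` arrays are stand-ins for whatever the actor would output for a batch of observations.

```python
import numpy as np


def sample_gaussian_action(mu, log_std, rng=None):
    """Draw one action per row from N(mu, exp(log_std)^2).

    Hypothetical helper for illustration only; not part of GBRL.
    """
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=np.float64)
    # The policy is parameterized by log_std, so exponentiate to get std > 0.
    std = np.exp(np.asarray(log_std, dtype=np.float64))
    # Sample as mean plus scaled standard-normal noise.
    noise = rng.standard_normal(mu.shape)
    return mu + std * noise


# Stand-in policy outputs: batch of 4 observations, 2 action dimensions.
mu = np.zeros((4, 2))
log_std = np.full((4, 2), -2.0)  # same value as the default log_std_init
actions = sample_gaussian_action(mu, log_std, rng=np.random.default_rng(0))
```

Working in log space keeps the standard deviation positive without a constraint; a `log_std` of -2 corresponds to a standard deviation of roughly 0.135.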

step(observations: numpy.ndarray | torch.Tensor | None = None, mu_grads: numpy.ndarray | torch.Tensor | None = None, log_std_grads: numpy.ndarray | torch.Tensor | None = None, mu_grad_clip: float | None = None, log_std_grad_clip: float | None = None) → None

Performs a single boosting iteration.

Parameters:
  • observations (Optional[NumericalData], optional) – Input observations. Defaults to None.

  • mu_grads (Optional[NumericalData], optional) – Manually computed mean gradients. Defaults to None.

  • log_std_grads (Optional[NumericalData], optional) – Manually computed log standard deviation gradients. Defaults to None.

  • mu_grad_clip (Optional[float], optional) – Gradient clipping for mean. Defaults to None.

  • log_std_grad_clip (Optional[float], optional) – Gradient clipping for log standard deviation. Defaults to None.
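As a rough sketch of what "manually computed" gradients could look like, the snippet below derives per-sample gradients of a Gaussian negative log-likelihood with respect to the mean and the log standard deviation, then clips them. The loss choice, the helper names `gaussian_nll_grads` and `clip_by_value`, and elementwise clipping are all assumptions made for illustration; this page does not state which loss is used or how `mu_grad_clip` and `log_std_grad_clip` are applied internally.

```python
import numpy as np


def gaussian_nll_grads(actions, mu, log_std):
    """Gradients of 0.5 * z**2 + log_std, with z = (actions - mu) / std.

    Hypothetical helper; the loss is an illustrative assumption.
    """
    std = np.exp(log_std)
    z = (actions - mu) / std
    mu_grads = -z / std          # d(loss)/d(mu)
    log_std_grads = 1.0 - z**2   # d(loss)/d(log_std)
    return mu_grads, log_std_grads


def clip_by_value(grads, clip):
    """Elementwise clipping to [-clip, clip]; a no-op when clip is None.

    Assumed behavior for illustration, not GBRL's documented clipping rule.
    """
    if clip is None:
        return grads
    return np.clip(grads, -clip, clip)


actions = np.array([[0.5, -1.0]])
mu = np.zeros((1, 2))
log_std = np.zeros((1, 2))

mu_grads, log_std_grads = gaussian_nll_grads(actions, mu, log_std)
mu_grads = clip_by_value(mu_grads, 0.6)
log_std_grads = clip_by_value(log_std_grads, None)

# With a constructed GaussianActor `actor` and matching `obs`, the arrays
# would be passed using the keyword names from the signature above:
# actor.step(observations=obs, mu_grads=mu_grads, log_std_grads=log_std_grads)
```

Passing precomputed arrays this way keeps the gradient computation outside the model, which can be convenient when the loss is evaluated in a separate framework.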