ParametricActor Class

This class implements a parametric actor for reinforcement learning, based on gradient boosted trees (GBT). The ParametricActor outputs a single parameter per action dimension. This allows the ParametricActor to parameterize deterministic policies, or discrete stochastic policies such as a categorical distribution.
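To illustrate what "a single parameter per action dimension" can mean in practice, here is a minimal sketch in plain Python. It does not use the gbrl API: the `actor_output` values and the helper names are hypothetical stand-ins for whatever the actor produces for one observation. The same vector is read either directly as a deterministic action or as logits of a categorical distribution.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn one raw output per action into categorical probabilities."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def deterministic_action(outputs: List[float]) -> List[float]:
    """Deterministic policy: use the per-dimension outputs as the action."""
    return list(outputs)


def greedy_discrete_action(outputs: List[float]) -> int:
    """Discrete policy: treat outputs as logits and pick the likeliest action."""
    probs = softmax(outputs)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical actor output for one observation, with output_dim = 3.
actor_output = [2.0, 0.5, -1.0]
probs = softmax(actor_output)
```

In the deterministic case `output_dim` would match the dimensionality of the action vector; in the categorical case it would match the number of discrete actions.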

class gbrl.ac_gbrl.ParametricActor(tree_struct: Dict, output_dim: int, policy_optimizer: Dict, gbrl_params: Dict = {}, bias: ndarray = None, verbose: int = 0, device: str = 'cpu')

Bases: GBRL

get_num_trees() → int

Returns the number of trees in the ensemble.

Return type:

int

classmethod load_model(load_name: str, device: str) → ParametricActor

Loads a GBRL model from a file.

Parameters:

  • load_name (str) – full path to the model file

  • device (str) – device to load the model onto, e.g. 'cpu'

Returns:

loaded ParametricActor model

Return type:

ParametricActor

step(observations: ndarray | Tensor, policy_grad_clip: float = None, policy_grad: ndarray | Tensor | None = None) → None

Performs a single boosting iteration.

Parameters:
  • observations (Union[np.ndarray, th.Tensor])

  • policy_grad_clip (float, optional) – threshold used to clip the policy gradients. Defaults to None.

  • policy_grad (Optional[Union[np.ndarray, th.Tensor]], optional) – manually calculated gradients. Defaults to None.
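As a rough illustration of what "manually calculated gradients" might look like for a categorical policy, the sketch below computes the gradient of the loss `-advantage * log pi(action)` with respect to each logit, then rescales the batch if its L2 norm exceeds a threshold. This is plain Python and makes several assumptions that the documentation above does not confirm: the helper names are hypothetical, the gradient is taken per sample without batch averaging, and norm-based rescaling is only one plausible reading of `policy_grad_clip`.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_policy_grad(
    logits: List[List[float]], actions: List[int], advantages: List[float]
) -> List[List[float]]:
    """Per-sample gradient of -advantage * log pi(action) w.r.t. the logits."""
    grads = []
    for z, a, adv in zip(logits, actions, advantages):
        probs = softmax(z)
        # d/dz_j of -adv * log softmax(z)[a] is adv * (p_j - 1[j == a]).
        grads.append(
            [adv * (p - (1.0 if j == a else 0.0)) for j, p in enumerate(probs)]
        )
    return grads


def clip_by_norm(
    grads: List[List[float]], max_norm: Optional[float]
) -> List[List[float]]:
    """Assumed clipping rule: rescale if the global L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for row in grads for g in row))
    if max_norm is None or norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [[g * scale for g in row] for row in grads]


# Hypothetical batch: two observations, two discrete actions.
logits = [[0.0, 0.0], [1.0, -1.0]]
actions = [0, 1]
advantages = [1.0, -0.5]

policy_grad = categorical_policy_grad(logits, actions, advantages)
clipped = clip_by_norm(policy_grad, max_norm=0.5)
```

With a real actor, values like these would first need converting to an ndarray or Tensor before being passed as `policy_grad` to `step`; the expected shape and sign convention should be checked against the library itself.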