BaseGBT

Abstract base class for all GBRL models. It defines the core API and shared logic for managing GBT learners, including training steps, gradient handling, SHAP value computation, device control, saving/loading, and visualization utilities.

class gbrl.models.base.BaseGBT[source]

Bases: ABC

Abstract base class for gradient boosting tree models.

This class defines the fundamental interface for all GBRL models and manages common functionality such as parameter storage, gradient tracking, and model serialization.

learner

The underlying GBTLearner instance.

grads

Stored gradients from the last backward pass.

params

Stored parameters (predictions) from the last forward pass.

input

Stored input from the last forward pass (when requires_grad=True).
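A minimal sketch of how these attributes might be populated across a forward/backward cycle. This is an illustrative toy class, not the real GBRL implementation; all names and the identity "prediction" are assumptions made for demonstration only.

```python
# Toy stand-in (NOT the real GBRL code) showing the forward/backward
# caching pattern described by the attributes above.
class ToyGBT:
    def __init__(self):
        self.params = None  # predictions from the last forward pass
        self.grads = None   # gradients from the last backward pass
        self.input = None   # cached input when requires_grad=True

    def __call__(self, x, requires_grad=True):
        if requires_grad:
            self.input = x          # keep the input for the backward pass
        self.params = list(x)       # stand-in "prediction": identity
        return self.params

    def backward(self, grads):
        # a real model would fit a tree to these gradients via step()
        self.grads = list(grads)

model = ToyGBT()
preds = model([1.0, 2.0, 3.0])
model.backward([0.1, -0.2, 0.3])
```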

copy() → BaseGBT[source]


Creates a deep copy of the model instance.

Returns:

A new instance with copied parameters and state.

Return type:

BaseGBT

export_learner(filename: str, modelname: str | None = None) → None[source]

Exports the model as a C header file for embedded deployment.

Parameters:
  • filename (str) – Absolute path and filename for the exported header. The .h extension will be added automatically.

  • modelname (Optional[str], optional) – Name to use for the model in the C code. Defaults to None (empty string).

Raises:

AssertionError – If learner is not initialized.

fit(*args, **kwargs) → float | Tuple[float, ...][source]

Fits the model for multiple iterations (supervised learning mode).

This method performs batch training by fitting multiple trees sequentially on the provided data.

Returns:

Final loss value(s) per learner, averaged over all examples.

Return type:

Union[float, Tuple[float, …]]

Raises:

NotImplementedError – This method must be implemented by subclasses that support supervised learning.

get_device() → str | Tuple[str, ...][source]

Gets the current computation device(s) for the model.

Returns:

Device string (‘cpu’ or ‘cuda’) per learner.

Returns a single string for single models or a tuple for multi-learner models.

Return type:

Union[str, Tuple[str, …]]

Raises:

AssertionError – If learner is not initialized.

get_grads() → numpy.ndarray | torch.Tensor | Tuple[numpy.ndarray | torch.Tensor, ...] | None[source]

Gets a copy of the gradients from the last backward pass.

Returns:

Cloned/copied gradients, or None if no backward pass has been performed.

Return type:

Optional[Union[NumericalData, Tuple[NumericalData, …]]]

get_iteration() → int | Tuple[int, ...][source]

Gets the current number of boosting iterations completed.

Returns:

Number of boosting iterations per learner.

Returns a single int for single models or a tuple for multi-learner models.

Return type:

Union[int, Tuple[int, …]]

Raises:

AssertionError – If learner is not initialized.

get_num_trees(*args, **kwargs) → int | Tuple[int, ...][source]

Gets the total number of trees in the ensemble.

Returns:

Number of trees in the ensemble per learner.

Returns a single int for single models or a tuple for multi-learner models.

Return type:

Union[int, Tuple[int, …]]

Raises:

AssertionError – If learner is not initialized.

get_params() → numpy.ndarray | torch.Tensor | None | Tuple[numpy.ndarray | torch.Tensor | None, ...][source]

Gets a copy of the model’s predicted parameters from the last forward pass.

Returns:

Cloned/copied parameters, or None if no forward pass has been performed.

Return type:

Union[Optional[NumericalData], Tuple[Optional[NumericalData], …]]

get_schedule_learning_rates() → float | Tuple[float, ...][source]

Gets current scheduled learning rate values for all optimizers.

For constant schedules, returns the initial learning rate unchanged. For linear schedules, returns the learning rate adjusted based on the number of trees in the ensemble relative to the total expected iterations.

Returns:

Current learning rate(s) per optimizer.

Return type:

Union[float, Tuple[float, …]]

Raises:

AssertionError – If learner is not initialized.
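The scheduling behavior described above can be sketched in a few lines. The exact linear decay form below (scaling by the fraction of expected iterations remaining) is an assumption for illustration; GBRL's actual formula may differ.

```python
# Sketch of scheduled learning-rate logic: constant schedules return the
# initial rate unchanged, linear schedules decay with ensemble size.
def scheduled_lr(initial_lr, num_trees, total_iterations, schedule="linear"):
    if schedule == "const":
        return initial_lr
    # assumed linear form: scale by remaining fraction of expected iterations
    return initial_lr * max(0.0, 1.0 - num_trees / total_iterations)

# halfway through training, a linear schedule has halved the rate
halfway_lr = scheduled_lr(0.1, num_trees=500, total_iterations=1000)
```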

get_total_iterations() → int[source]

Gets the total cumulative number of boosting iterations.

For actor-critic models with separate learners, this returns the sum of iterations across both actor and critic. For single models or shared architectures, this equals get_iteration().

Returns:

Total number of boosting iterations across all learners.

Return type:

int

Raises:

AssertionError – If learner is not initialized.

classmethod load_learner(load_name: str, device: str) → BaseGBT[source]

Loads a model from disk.

Parameters:
  • load_name (str) – Full path to the saved model file.

  • device (str) – Device to load the model onto (‘cpu’ or ‘cuda’).

Returns:

Loaded model instance of the appropriate subclass.

Return type:

BaseGBT

plot_tree(tree_idx: int, filename: str, *args, **kwargs) → None[source]

Visualizes a tree and saves it as a PNG image.

Note: Only works if GBRL was compiled with Graphviz support.

Parameters:
  • tree_idx (int) – Index of the tree to visualize.

  • filename (str) – Output filename for the PNG image (extension optional).

Raises:

AssertionError – If learner is not initialized.

print_tree(tree_idx: int, *args, **kwargs) → None[source]

Prints detailed information about a specific tree to stdout.

Parameters:

tree_idx (int) – Index of the tree to print.

Raises:

AssertionError – If learner is not initialized.

save_learner(save_path: str) → None[source]

Saves the model to disk.

Parameters:

save_path (str) – Absolute path and filename for saving the model. The .gbrl_model extension will be added automatically.

Raises:

AssertionError – If learner is not initialized.
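The documented filename handling (the .gbrl_model extension is appended automatically) can be sketched as below. Appending only when the suffix is missing is an assumption about the behavior, not confirmed from the source.

```python
# Sketch of save_learner/load_learner path resolution: the .gbrl_model
# suffix is appended automatically when not already present (assumed).
def resolve_model_path(save_path: str, suffix: str = ".gbrl_model") -> str:
    return save_path if save_path.endswith(suffix) else save_path + suffix

path = resolve_model_path("/tmp/actor_critic")
```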

set_bias(*args, **kwargs) → None[source]

Sets the bias term for the GBRL model.

This method should be implemented by subclasses to set the initial bias value for predictions.

Raises:

NotImplementedError – This is an abstract method that must be implemented by subclasses.

set_device(device: str)[source]

Sets the computation device for the GBRL model.

Parameters:

device (str) – Target device, must be either ‘cpu’ or ‘cuda’.

Raises:

AssertionError – If device is not ‘cpu’ or ‘cuda’, or if learner is not initialized.

set_feature_weights(feature_weights: numpy.ndarray | torch.Tensor) → None[source]

Sets per-feature importance weights for split selection.

Feature weights are used to scale the contribution of each feature when selecting the best split during tree construction. Higher weights give features more importance in the splitting process.

Parameters:

feature_weights (NumericalData) – Array of weights (one per feature). All weights must be >= 0.

Raises:

AssertionError – If learner is not initialized or weights are invalid.
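A toy illustration of how per-feature weights can bias split selection: each candidate feature's split gain is scaled by its weight before the best feature is chosen. The multiplicative scaling scheme here is an assumption, not GBRL's exact split criterion.

```python
# Toy weighted split selection: higher weights make a feature more
# likely to win the argmax over candidate split gains.
def best_split_feature(raw_gains, feature_weights):
    assert all(w >= 0 for w in feature_weights), "weights must be >= 0"
    weighted = [g * w for g, w in zip(raw_gains, feature_weights)]
    return max(range(len(weighted)), key=weighted.__getitem__)

# down-weighting feature 0 lets feature 1 win despite a lower raw gain
choice = best_split_feature([1.0, 0.9, 0.4], [0.1, 1.0, 1.0])
```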

shap(features: numpy.ndarray | torch.Tensor, *args, **kwargs) → numpy.ndarray | Tuple[numpy.ndarray, numpy.ndarray][source]

Calculates SHAP values for the entire ensemble.

Implementation based on the Linear TreeShap algorithm by Yu et al., 2023. Computes SHAP values sequentially for each tree and aggregates them. See: https://arxiv.org/pdf/2209.08192 and https://github.com/yupbank/linear_tree_shap

Parameters:

features (NumericalData) – Input features for SHAP computation.

Returns:

SHAP values with shape [n_samples, n_features, n_outputs]. Returns a tuple of SHAP values for separate actor-critic models.

Return type:

Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]

Raises:

AssertionError – If learner is not initialized.
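The documented output shape, and the SHAP additivity property (per-sample attributions plus the expected value reconstruct the prediction), can be checked on toy data. All numbers below are fabricated for illustration only.

```python
# Fabricated SHAP output with shape [n_samples, n_features, n_outputs],
# used to demonstrate the additivity property of TreeSHAP-style values.
n_samples, n_features, n_outputs = 2, 3, 1
shap_values = [[[0.5], [-0.2], [0.1]],
               [[0.0], [0.3], [-0.1]]]
expected_value = 1.0  # hypothetical model bias / base value

# reconstruct each prediction: base value + sum of per-feature attributions
predictions = [
    [expected_value + sum(shap_values[i][f][k] for f in range(n_features))
     for k in range(n_outputs)]
    for i in range(n_samples)
]
```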

abstractmethod step(*args, **kwargs) → None[source]

Performs a single boosting step by fitting one tree to gradients.

This method should be implemented by subclasses to add a new tree to the ensemble based on the computed gradients.

Raises:

NotImplementedError – This is an abstract method that must be implemented by subclasses.
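A boosting step of this kind can be sketched with a trivial stand-in: fit a constant predictor (the mean gradient) in place of a real decision tree, then add it to the ensemble scaled by the learning rate in the descent direction. This is a conceptual toy, not a subclass implementation.

```python
# Toy boosting step: one "tree" per call, here collapsed to a constant.
def step(ensemble, grads, learning_rate=0.1):
    mean_grad = sum(grads) / len(grads)          # stand-in for a fitted tree
    ensemble.append(-learning_rate * mean_grad)  # move against the gradient

ensemble = []
step(ensemble, [1.0, 3.0])          # mean gradient 2.0
prediction_delta = sum(ensemble)    # ensemble's added contribution
```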

tree_shap(tree_idx: int, features: numpy.ndarray | torch.Tensor, *args, **kwargs) → numpy.ndarray | Tuple[numpy.ndarray, numpy.ndarray][source]

Calculates SHAP values for a single tree in the ensemble.

Implementation based on the Linear TreeShap algorithm by Yu et al., 2023. See: https://arxiv.org/pdf/2209.08192 and https://github.com/yupbank/linear_tree_shap

Parameters:
  • tree_idx (int) – Index of the tree to compute SHAP values for.

  • features (NumericalData) – Input features for SHAP computation.

Returns:

SHAP values with shape [n_samples, n_features, n_outputs]. Returns a tuple of SHAP values for separate actor-critic models.

Return type:

Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]

Raises:

AssertionError – If learner is not initialized.