GBTLearner
Concrete implementation of BaseLearner that wraps a single gradient boosted tree ensemble using the GBRL C++ backend. It supports training, prediction, saving/loading, and SHAP value computation.
- class gbrl.learners.gbt_learner.GBTLearner(input_dim: int, output_dim: int, tree_struct: Dict, optimizers: Dict | List, params: Dict, policy_dim: int | None = None, verbose: int = 0, device: str = 'cpu')[source]
Bases:
BaseLearner
GBTLearner is a gradient boosted tree learner that utilizes a C++ backend for efficient computation. It supports training, prediction, saving, loading, and SHAP value computation.
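A minimal construction sketch follows. Only the constructor signature above comes from the documentation; the specific keys shown for tree_struct, the optimizer dictionary, and params are illustrative assumptions and may differ between GBRL versions.

```python
import numpy as np
from gbrl.learners.gbt_learner import GBTLearner

# Illustrative configuration dictionaries; the exact keys and values are
# assumptions and should be checked against your GBRL version.
tree_struct = {"max_depth": 4, "n_bins": 256, "min_data_in_leaf": 0, "grow_policy": "greedy"}
optimizer = {"algo": "SGD", "lr": 0.1, "start_idx": 0, "stop_idx": 1}
params = {"split_score_func": "Cosine", "generator_type": "Quantile"}

learner = GBTLearner(
    input_dim=8,           # number of input features
    output_dim=1,          # dimensionality of the predicted output
    tree_struct=tree_struct,
    optimizers=optimizer,  # a single optimizer Dict, or a List of them
    params=params,
    device="cpu",
)
```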
- distil(obs: numpy.ndarray, targets: numpy.ndarray, params: Dict, verbose: int = 0) Tuple[float, Dict][source]
Distills the model into a student model.
- Parameters:
obs (np.ndarray) – Input observations.
targets (np.ndarray) – Target values.
params (Dict) – Distillation parameters.
verbose (int, optional) – Verbosity level. Defaults to 0.
- Returns:
The final loss and updated parameters.
- Return type:
Tuple[float, Dict]
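A hedged usage sketch, assuming the learner has already been trained and that distillation is configured through a dictionary of hyperparameters; the keys below are illustrative, not canonical.

```python
import numpy as np

obs = np.random.randn(128, 8).astype(np.float32)
# Use the current ensemble's predictions as distillation targets.
soft_targets = learner.predict(obs, requires_grad=False, tensor=False)

# Keys in distil_params are illustrative assumptions.
distil_params = {"min_steps": 5, "max_steps": 100, "lr": 0.1}
final_loss, updated_params = learner.distil(obs, soft_targets, distil_params, verbose=1)
```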
- export(filename: str, modelname: str | None = None) None[source]
Exports the model to a C header file.
- Parameters:
filename (str) – The filename to export the model to.
modelname (str, optional) – The name of the model in the C code. Defaults to None.
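For example (the header filename and model name are arbitrary):

```python
# Write the ensemble as a C header for embedded or dependency-free inference;
# modelname is assumed to control the symbol prefix in the generated code.
learner.export("gbt_model.h", modelname="gbt_model")
```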
- fit(features: numpy.ndarray | torch.Tensor, targets: numpy.ndarray | torch.Tensor, iterations: int, shuffle: bool = True, loss_type: str = 'MultiRMSE') float[source]
Fits the model to the provided features and targets for a given number of iterations.
- Parameters:
features (NumericalData) – Input features.
targets (NumericalData) – Target values.
iterations (int) – Number of training iterations.
shuffle (bool, optional) – Whether to shuffle the data. Defaults to True.
loss_type (str, optional) – Type of loss function. Defaults to ‘MultiRMSE’.
- Returns:
The final loss value.
- Return type:
float
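A minimal supervised-fitting sketch on synthetic data, assuming a learner constructed as above with input_dim=8 and output_dim=1:

```python
import numpy as np

X = np.random.randn(256, 8).astype(np.float32)
y = (2.0 * X[:, 0] - X[:, 1]).reshape(-1, 1).astype(np.float32)

# Boost for 50 iterations; the returned value is the final loss.
final_loss = learner.fit(X, y, iterations=50, shuffle=True, loss_type="MultiRMSE")
print(f"final loss: {final_loss:.4f}")
```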
- get_bias() numpy.ndarray[source]
Returns the bias of the model.
- Returns:
The bias.
- Return type:
np.ndarray
- get_device() str[source]
Returns the device the model is running on.
- Returns:
The device.
- Return type:
str
- get_feature_weights() numpy.ndarray[source]
Returns the feature weights of the model.
- Returns:
The feature weights.
- Return type:
np.ndarray
- get_iteration() int[source]
Returns the current iteration number.
- Returns:
The current iteration number.
- Return type:
int
- get_num_trees() int[source]
Returns the total number of trees in the ensemble.
- Returns:
The total number of trees.
- Return type:
int
- get_schedule_learning_rates() numpy.ndarray | Tuple[numpy.ndarray, ...][source]
Returns the learning rates of the schedulers.
- Returns:
The learning rates.
- Return type:
Union[np.ndarray, Tuple[np.ndarray, ...]]
- classmethod load(filename: str, device: str) GBTLearner[source]
Loads a GBTLearner model from a file.
- Parameters:
filename (str) – The filename to load the model from.
device (str) – The device to load the model onto.
- Returns:
The loaded GBTLearner instance.
- Return type:
GBTLearner
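For example, restoring a checkpoint written by save() (the filename is arbitrary):

```python
from gbrl.learners.gbt_learner import GBTLearner

# Restore a previously saved learner onto the CPU.
learner = GBTLearner.load("gbt_checkpoint", device="cpu")
print(learner.get_num_trees(), "trees, iteration", learner.get_iteration())
```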
- plot_tree(tree_idx: int, filename: str) None[source]
Plots the tree at the given index and saves it to a file.
- Parameters:
tree_idx (int) – The index of the tree to plot.
filename (str) – The filename to save the plot to.
- predict(inputs: numpy.ndarray | torch.Tensor, requires_grad: bool = True, start_idx: int | None = None, stop_idx: int | None = None, tensor: bool = True) numpy.ndarray | torch.Tensor[source]
Predicts the output for the given features.
- Parameters:
inputs (NumericalData) – Input features.
requires_grad (bool, optional) – Whether to compute gradients. Defaults to True.
start_idx (int, optional) – Start index for prediction. Defaults to None.
stop_idx (int, optional) – Stop index for prediction. Defaults to None.
tensor (bool, optional) – Whether to return a tensor. Defaults to True.
- Returns:
The predicted output.
- Return type:
NumericalData
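A usage sketch; the exact start/stop slice semantics are not stated above, so that part is an assumption:

```python
# Full-ensemble prediction as a NumPy array (no autograd graph).
preds = learner.predict(X, requires_grad=False, tensor=False)

# Prediction as a torch tensor that can participate in autograd.
pred_tensor = learner.predict(X, requires_grad=True, tensor=True)

# Prediction using only an initial slice of the ensemble's trees
# (assumed interpretation of start_idx/stop_idx).
partial = learner.predict(X, start_idx=0, stop_idx=10, tensor=False)
```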
- print_tree(tree_idx: int) None[source]
Prints the tree at the given index.
- Parameters:
tree_idx (int) – The index of the tree to print.
- reset() None[source]
Resets the learner to its initial state, reinitializing the C++ model and optimizers.
- save(filename: str) None[source]
Saves the model to a file.
- Parameters:
filename (str) – The filename to save the model to.
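For example:

```python
# Persist the ensemble and optimizer state; the backend chooses its own
# on-disk format (and possibly a file suffix).
learner.save("gbt_checkpoint")
```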
- set_bias(bias: numpy.ndarray | torch.Tensor | float) None[source]
Sets the bias of the model.
- Parameters:
bias (Union[NumericalData, float]) – The bias value.
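A common warm start is to initialize the bias to the target mean, assuming the bias acts as the constant initial prediction added to every output:

```python
# Set the bias to the per-output mean of the targets, then read it back.
learner.set_bias(y.mean(axis=0))
print(learner.get_bias())
```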
- set_device(device: str | torch.device) None[source]
Sets the device the model should run on.
- Parameters:
device (Union[str, th.device]) – The device to set.
- set_feature_weights(feature_weights: numpy.ndarray | torch.Tensor | float) None[source]
Sets the feature weights of the model.
- Parameters:
feature_weights (Union[NumericalData, float]) – The feature weights.
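A sketch that up-weights a single feature; how the weights influence tree construction (e.g. split scoring) is an assumption here:

```python
import numpy as np

# One weight per input feature (input_dim = 8 in the construction example).
weights = np.ones(8, dtype=np.float32)
weights[0] = 2.0   # emphasize the first feature
learner.set_feature_weights(weights)
print(learner.get_feature_weights())
```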
- shap(features: numpy.ndarray | torch.Tensor) numpy.ndarray[source]
Computes SHAP values for the entire ensemble.
Uses Linear TreeSHAP for each tree in the ensemble (sequentially). Implementation based on https://github.com/yupbank/linear_tree_shap; see Linear TreeShap, Yu et al., 2023, https://arxiv.org/pdf/2209.08192.
- Parameters:
features (NumericalData) – Input features.
- Returns:
shap values
- Return type:
np.ndarray
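A usage sketch; the exact shape of the returned array (per sample, feature, and output) is not specified above, so the print is only for inspection:

```python
shap_values = learner.shap(X)   # attributions for every sample in X
print(shap_values.shape)
```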
- step(inputs: numpy.ndarray | torch.Tensor, grads: numpy.ndarray | torch.Tensor | Tuple[numpy.ndarray | torch.Tensor, ...]) None[source]
Performs a single gradient update step by adding a decision tree to the ensemble.
- Parameters:
inputs (NumericalData) – Input features (NumPy array or PyTorch tensor).
grads (NumericalData or Tuple[NumericalData, ...]) – Gradients for the update step.
- Returns:
None
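A sketch of the functional-gradient loop that step() enables: compute the loss gradient with respect to the current predictions and fit one new tree per call. The squared-error gradient and the assumption that the optimizer applies the learning rate and sign internally are mine, not from the documentation above.

```python
import numpy as np

X = np.random.randn(256, 8).astype(np.float32)
y = (2.0 * X[:, 0] - X[:, 1]).reshape(-1, 1).astype(np.float32)

for _ in range(20):
    preds = learner.predict(X, requires_grad=False, tensor=False)
    grads = preds - y          # d/d(preds) of 0.5 * ||preds - y||^2
    learner.step(X, grads)     # adds one tree fit to these gradients
```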
- tree_shap(tree_idx: int, features: numpy.ndarray | torch.Tensor) numpy.ndarray[source]
Computes SHAP values for a single tree.
Implementation based on https://github.com/yupbank/linear_tree_shap; see Linear TreeShap, Yu et al., 2023, https://arxiv.org/pdf/2209.08192.
- Parameters:
tree_idx (int) – The index of the tree to compute SHAP values for.
features (NumericalData) – Input features.
- Returns:
shap values
- Return type:
np.ndarray
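For example, attributions from the most recently added tree:

```python
last_idx = learner.get_num_trees() - 1
tree_attr = learner.tree_shap(last_idx, X)
```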