boosters.GBDTConfig#

class boosters.GBDTConfig#

Bases: object

Main configuration for GBDT model.

This is the primary configuration class for gradient boosted decision trees. All parameters are flat (no nested config objects), matching the core Rust API.

# Arguments

  • n_estimators - Number of boosting rounds (trees to train). Default: 100.

  • learning_rate - Step size shrinkage (0.01 - 0.3 typical). Default: 0.3.

  • objective - Loss function for training. Default: Objective.Squared().

  • metric - Evaluation metric. None uses objective’s default.

  • growth_strategy - Tree growth strategy. Default: GrowthStrategy.Depthwise.

  • max_depth - Maximum tree depth (only for depthwise). Default: 6.

  • n_leaves - Maximum leaves (only for leafwise). Default: 31.

  • max_onehot_cats - Max categories for one-hot encoding. Default: 4.

  • l1 - L1 regularization on leaf weights. Default: 0.0.

  • l2 - L2 regularization on leaf weights. Default: 1.0.

  • min_gain_to_split - Minimum gain required to make a split. Default: 0.0.

  • min_child_weight - Minimum sum of hessians in a leaf. Default: 1.0.

  • min_samples_leaf - Minimum samples in a leaf. Default: 1.

  • subsample - Row subsampling ratio per tree. Default: 1.0.

  • colsample_bytree - Column subsampling per tree. Default: 1.0.

  • colsample_bylevel - Column subsampling per level. Default: 1.0.

  • linear_leaves - Enable linear models in leaves (experimental). Default: False.

  • linear_l2 - L2 regularization for linear coefficients. Default: 0.01.

  • linear_l1 - L1 regularization for linear coefficients. Default: 0.0.

  • early_stopping_rounds - Stop training if the evaluation metric has not improved for this many rounds. None disables early stopping.

  • seed - Random seed for reproducibility. Default: 42.

# Example (Python)

```python
from boosters import GBDTConfig, Objective  # import path for Objective is assumed

config = GBDTConfig(
    n_estimators=500,
    learning_rate=0.1,
    objective=Objective.logistic(),
    max_depth=6,
    l2=1.0,
)
```

classmethod __new__(*args, **kwargs)#

binning_sample_cnt#

Number of samples for computing bin boundaries (for large datasets).

cache_size#

Histogram cache size (number of slots).

colsample_bylevel#

Column subsampling ratio per level.

colsample_bytree#

Column subsampling ratio per tree.

early_stopping_rounds#

Early stopping rounds (None = disabled).
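
The stopping rule described here can be sketched in plain Python. This is a minimal illustration, not the boosters implementation: the function name `stopping_round` is hypothetical, and it assumes the tracked metric is one where lower is better (for example, a loss).

```python
from typing import Optional, Sequence


def stopping_round(scores: Sequence[float],
                   early_stopping_rounds: Optional[int]) -> Optional[int]:
    """Return the 0-based round at which training would stop, or None.

    Assumes a lower score is better. Hypothetical helper for illustration.
    """
    if early_stopping_rounds is None:  # None = disabled
        return None
    best = float("inf")
    rounds_without_improvement = 0
    for round_idx, score in enumerate(scores):
        if score < best:
            best = score
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= early_stopping_rounds:
                return round_idx
    return None  # ran through every round without triggering the rule


validation_loss = [1.0, 0.8, 0.9, 0.85, 0.7]
print(stopping_round(validation_loss, 2))     # stops before the late improvement
print(stopping_round(validation_loss, None))  # disabled
```

A larger value tolerates longer plateaus at the cost of training more rounds before giving up.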

enable_bundling#

Enable exclusive feature bundling for sparse data. Default: true. Bundles sparse/one-hot features to reduce memory and speed up training.

growth_strategy#

Growth strategy for tree building.

l1#

L1 regularization on leaf weights.

l2#

L2 regularization on leaf weights.
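
To give a feel for how `l1` and `l2` act on leaf weights, the sketch below uses the second-order leaf-weight formula popularized by XGBoost. Whether boosters computes leaf values exactly this way is an assumption. The function `leaf_weight` and its arguments are hypothetical and not part of the boosters API.

```python
def leaf_weight(grad_sum: float, hess_sum: float,
                l1: float = 0.0, l2: float = 1.0) -> float:
    """Regularized leaf weight, XGBoost-style (assumed, not confirmed for boosters)."""
    # L1 soft-thresholds the gradient sum toward zero.
    if grad_sum > l1:
        shrunk = grad_sum - l1
    elif grad_sum < -l1:
        shrunk = grad_sum + l1
    else:
        shrunk = 0.0
    # L2 is added to the hessian sum, which shrinks the weight's magnitude.
    return -shrunk / (hess_sum + l2)


print(leaf_weight(-10.0, 4.0))          # documented defaults: l1=0.0, l2=1.0
print(leaf_weight(-10.0, 4.0, l1=2.0))  # L1 pulls the weight toward zero
```

Under this formulation, raising `l2` shrinks every leaf smoothly, while raising `l1` can set weakly supported leaves to exactly zero.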

learning_rate#

Learning rate (step size shrinkage).
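
Step size shrinkage means each tree's contribution is scaled down before it is added to the ensemble. A minimal sketch of that idea in generic gradient boosting follows. The helper `ensemble_prediction` is hypothetical, and whether boosters applies the scaling at training time or at prediction time is not stated here.

```python
def ensemble_prediction(base_score, tree_outputs, learning_rate=0.3):
    """Sum per-tree outputs for one sample, each scaled by the learning rate."""
    return base_score + learning_rate * sum(tree_outputs)


# Two trees voting 1.0 and 2.0 for the same sample, with learning_rate=0.5.
print(ensemble_prediction(0.0, [1.0, 2.0], learning_rate=0.5))
```

Smaller learning rates generally need more boosting rounds (`n_estimators`) to reach the same fit.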

linear_coefficient_threshold#

Threshold for pruning small coefficients.

linear_l1#

L1 regularization for linear coefficients.

linear_l2#

L2 regularization for linear coefficients.

linear_leaves#

Enable linear models in leaves.

linear_max_features#

Maximum features in linear model per leaf.

linear_max_iterations#

Maximum coordinate descent iterations for linear leaves.

linear_min_samples#

Minimum samples required to fit linear model in leaf.

linear_skip_first_n_trees#

Number of initial trees for which linear leaf fitting is skipped. Default is 1 (the first tree has homogeneous gradients). Set to 0 to enable linear leaves from the first tree.

linear_tolerance#

Convergence tolerance for linear leaves.

linear_use_global_features#

Use global features instead of path features for linear models. When true, uses the top-k most frequently split features for all leaves. This can improve extrapolation by ensuring important features are always included.

max_bins#

Maximum bins per feature for binning (1-256).

max_categorical_cardinality#

Max cardinality to auto-detect as categorical. Features with ≤ this many unique integer values may be treated as categorical.
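
The check described above can be approximated as follows. This is a sketch of the stated rule only: the helper `may_be_categorical` is hypothetical, and boosters may apply further conditions, since the text says such features "may be" treated as categorical.

```python
def may_be_categorical(values, max_categorical_cardinality):
    """True if all values are integer-valued and the unique count is within the limit."""
    unique = set()
    for v in values:
        if not float(v).is_integer():  # a fractional, NaN, or infinite value rules it out
            return False
        unique.add(int(v))
    return len(unique) <= max_categorical_cardinality


print(may_be_categorical([0, 1, 2, 1, 0], 4))     # 3 unique integer values
print(may_be_categorical([0, 1, 2, 3, 4, 5], 4))  # 6 unique integer values
```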

max_depth#

Maximum depth of tree (for depthwise growth).

max_onehot_cats#

Maximum categories for one-hot encoding categorical splits.

metric#

Get the evaluation metric (or None).

min_child_weight#

Minimum sum of hessians required in a leaf.

min_gain_to_split#

Minimum gain required to make a split.

min_samples_leaf#

Minimum number of samples required in a leaf.
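
The three split constraints (`min_gain_to_split`, `min_child_weight`, `min_samples_leaf`) can be read as a single gate on each candidate split. The sketch below is an illustration under assumptions: the helper `split_allowed` is hypothetical, each minimum is treated as inclusive, and the leaf constraints are applied to both children.

```python
def split_allowed(gain, left_hess, right_hess, left_count, right_count,
                  min_gain_to_split=0.0, min_child_weight=1.0, min_samples_leaf=1):
    """Return True if a candidate split satisfies all three minimums."""
    if gain < min_gain_to_split:
        return False
    # Each child must carry enough hessian mass.
    if min(left_hess, right_hess) < min_child_weight:
        return False
    # Each child must contain enough samples.
    if min(left_count, right_count) < min_samples_leaf:
        return False
    return True


print(split_allowed(0.5, 2.0, 3.0, 10, 12))  # passes with the documented defaults
print(split_allowed(0.5, 0.4, 3.0, 10, 12))  # left child has too little hessian mass
```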

n_estimators#

Number of boosting rounds.

n_leaves#

Maximum number of leaves (for leafwise growth).
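
When comparing `max_depth` with `n_leaves`, it helps to recall that a binary tree of depth d has at most 2^d leaves. The short sketch below works that out, assuming the root sits at depth 0. How boosters counts depth is not stated here, and `max_leaves_for_depth` is a hypothetical helper.

```python
def max_leaves_for_depth(max_depth: int) -> int:
    """Upper bound on leaves in a binary tree whose root is at depth 0."""
    return 2 ** max_depth


print(max_leaves_for_depth(6))  # 64
```

Under that assumption, the default depthwise limit (`max_depth=6`) allows up to 64 leaves, roughly twice the default leafwise limit (`n_leaves=31`).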

objective#

Get the objective function.

seed#

Random seed.

sparsity_threshold#

Sparsity threshold (fraction of zeros to use sparse storage). Features with density ≤ (1 - threshold) are considered sparse.
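
The density rule can be made concrete with a small sketch. It follows the inequality stated above literally. The helper `is_sparse` is hypothetical and it assumes a non-empty feature column.

```python
def is_sparse(values, sparsity_threshold):
    """True if the fraction of non-zero entries is <= (1 - threshold)."""
    nonzero = sum(1 for v in values if v != 0)
    density = nonzero / len(values)
    return density <= 1.0 - sparsity_threshold


print(is_sparse([0, 0, 0, 1], 0.7))  # density 0.25
print(is_sparse([0, 1, 1, 1], 0.7))  # density 0.75
```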

subsample#

Row subsampling ratio per tree.
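
A rough picture of seeded row subsampling, using only the standard library. This is not how boosters necessarily draws samples: the helper `sample_rows` is hypothetical, and the actual sampling scheme and the way `seed` is combined with the tree index are not described here.

```python
import random


def sample_rows(n_rows, subsample, seed=42):
    """Pick a reproducible subset of row indices for one tree."""
    rng = random.Random(seed)  # a fixed seed gives the same subset on every run
    k = max(1, int(n_rows * subsample))
    return sorted(rng.sample(range(n_rows), k))


rows = sample_rows(100, 0.5)
print(len(rows))  # 50
```

With `subsample=1.0` every row is used, which matches the documented default.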

verbosity#

Verbosity level for training output.