boosters.sklearn.GBDTRegressor
- class boosters.sklearn.GBDTRegressor
Bases: _GBDTEstimatorBase, RegressorMixin
Gradient Boosted Decision Tree Regressor.
A sklearn-compatible wrapper around GBDTModel for regression.
- Parameters:
n_estimators (int, default=100) – Number of boosting rounds.
learning_rate (float, default=0.1) – Shrinkage factor applied to the contribution of each tree.
max_depth (int, default=6) – Maximum depth of each tree.
min_child_weight (float, default=1.0) – Minimum sum of instance weight (hessian) in a child node.
max_leaves (int, default=31) – Maximum number of leaves per tree.
grow_strategy (GrowthStrategy, default=GrowthStrategy.Depthwise) – Tree growing strategy.
colsample_bytree (float, default=1.0) – Subsample ratio of columns for each tree.
subsample (float, default=1.0) – Subsample ratio of training instances.
gamma (float, default=0.0) – Minimum loss reduction required for split.
reg_alpha (float, default=0.0) – L1 regularization on weights.
reg_lambda (float, default=1.0) – L2 regularization on weights.
early_stopping_rounds (int or None, default=None) – Stop if no improvement for this many rounds.
seed (int, default=42) – Random seed.
n_threads (int, default=0) – Number of threads (0 = auto).
verbose (int, default=1) – Verbosity level.
objective (Objective or None, default=None) – Loss function. Must be a regression objective (e.g., Objective.squared()). If None, uses Objective.squared().
metric (Metric or None, default=None) – Evaluation metric. If None, uses Metric.rmse().
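The parameters min_child_weight, gamma, and reg_lambda are named like the terms of the second-order split criterion popularized by XGBoost. The sketch below assumes that formulation and ignores reg_alpha; whether this library computes gain in exactly this way is an assumption, and the helper names (leaf_score, split_gain, split_allowed) are hypothetical.

```python
def leaf_score(grad_sum, hess_sum, reg_lambda):
    """Structure score G^2 / (H + lambda) of a node (assumed formulation)."""
    return grad_sum ** 2 / (hess_sum + reg_lambda)


def split_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0, gamma=0.0):
    """Loss reduction of a candidate split, minus the gamma penalty."""
    children = (leaf_score(g_left, h_left, reg_lambda)
                + leaf_score(g_right, h_right, reg_lambda))
    parent = leaf_score(g_left + g_right, h_left + h_right, reg_lambda)
    return 0.5 * (children - parent) - gamma


def split_allowed(g_left, h_left, g_right, h_right,
                  reg_lambda=1.0, gamma=0.0, min_child_weight=1.0):
    """Reject splits whose children are too light or whose gain is not positive."""
    if min(h_left, h_right) < min_child_weight:
        return False
    return split_gain(g_left, h_left, g_right, h_right, reg_lambda, gamma) > 0.0


# Gradient and hessian sums for a toy candidate split.
gain = split_gain(-4.0, 2.0, 4.0, 2.0)
```

Under these assumptions, raising gamma or min_child_weight makes the same candidate split less likely to be accepted, while a larger reg_lambda shrinks every structure score toward zero.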
- Attributes:
model (GBDTModel) – The fitted core model.
n_features_in (int) – Number of features seen during fit.
feature_importances_ (ndarray of shape (n_features,)) – Feature importance scores (gain-based).
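A minimal usage sketch, assuming the class is importable as boosters.sklearn.GBDTRegressor (the path is taken from its qualified name) and that the data is already split into training and validation arrays. The helper name fit_and_evaluate and the hyperparameter values are illustrative only.

```python
def fit_and_evaluate(X_train, y_train, X_val, y_val):
    """Fit a GBDTRegressor and return the model, its predictions, and R^2."""
    # Imported lazily so this helper can be defined even where the
    # boosters package is not installed.
    from boosters.sklearn import GBDTRegressor

    model = GBDTRegressor(
        n_estimators=200,          # upper bound on boosting rounds
        learning_rate=0.05,
        max_depth=4,
        early_stopping_rounds=20,  # assumed to be monitored on eval_set
    )
    model.fit(X_train, y_train, eval_set=(X_val, y_val))

    predictions = model.predict(X_val)
    r2 = model.score(X_val, y_val)
    return model, predictions, r2
```

Only constructor arguments and methods listed on this page are used; objective and metric are left at None, so the documented defaults (Objective.squared() and Metric.rmse()) apply.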
- __init__(n_estimators=100, learning_rate=0.1, max_depth=6, min_child_weight=1.0, max_leaves=31, grow_strategy=GrowthStrategy.Depthwise, colsample_bytree=1.0, subsample=1.0, gamma=0.0, reg_alpha=0.0, reg_lambda=1.0, early_stopping_rounds=None, seed=42, n_threads=0, verbose=1, objective=None, metric=None)
- Parameters:
n_estimators (int)
learning_rate (float)
max_depth (int)
min_child_weight (float)
max_leaves (int)
grow_strategy (GrowthStrategy)
colsample_bytree (float)
subsample (float)
gamma (float)
reg_alpha (float)
reg_lambda (float)
early_stopping_rounds (int | None)
seed (int)
n_threads (int)
verbose (int)
objective (Objective | None)
metric (Metric | None)
- Return type:
None
- property feature_importances_: ndarray[tuple[Any, ...], dtype[float32]]
Return feature importances (gain-based).
- fit(X, y, eval_set=None, sample_weight=None)
Fit the estimator.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training input samples.
y (array-like of shape (n_samples,)) – Target values.
eval_set (tuple of (X, y), optional) – Validation set as (X_val, y_val) tuple.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.
- Returns:
self – Fitted estimator.
- Return type:
Self
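The early_stopping_rounds parameter is described as stopping when there is no improvement for that many rounds. The sketch below shows that patience rule in plain Python on a list of per-round validation losses. It illustrates the general technique, not this library's implementation; that the monitored quantity comes from eval_set is an assumption, and best_round_with_patience is a hypothetical helper.

```python
def best_round_with_patience(val_losses, early_stopping_rounds):
    """Return (best_round, last_round_run) under a simple patience rule."""
    best_loss = float("inf")
    best_round = -1
    for round_idx, loss in enumerate(val_losses):
        if loss < best_loss:
            # Strict improvement resets the patience counter.
            best_loss = loss
            best_round = round_idx
        elif round_idx - best_round >= early_stopping_rounds:
            # No improvement for `early_stopping_rounds` consecutive rounds.
            return best_round, round_idx
    return best_round, len(val_losses) - 1


# Validation loss per boosting round (made-up numbers for illustration).
history = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
best, stopped = best_round_with_patience(history, early_stopping_rounds=3)
```

In this toy history, training halts before the final round is reached, so the later improvement to 0.60 is never seen. That is the usual trade-off of a small patience value.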
- get_feature_importance(importance_type=ImportanceType.Gain)
Get feature importance scores.
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)
Get parameters for this estimator.
- predict(X)
Predict using the fitted model.
- score(X, y, sample_weight=None)
Return coefficient of determination on test data.
The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns:
score – \(R^2\) of self.predict(X) w.r.t. y.
- Return type:
float
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
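The \(R^2\) definition above can be checked by hand. This is a small pure-Python sketch for the single-output, unweighted case; r2_score_manual is a hypothetical helper, not part of the library.

```python
def r2_score_manual(y_true, y_pred):
    """Compute R^2 = 1 - u / v for single-output, unweighted data."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares (zero if y_true is constant, which would
    # make the ratio undefined; not handled in this sketch)
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [3.0, 5.0, 7.0, 9.0]

perfect = r2_score_manual(y_true, y_true)                 # exact predictions
constant = r2_score_manual(y_true, [6.0] * 4)             # always predicts the mean
worse = r2_score_manual(y_true, [9.0, 7.0, 5.0, 3.0])     # reversed targets
```

The three cases cover the ranges described above: a perfect fit scores 1.0, the mean-only model scores 0.0, and predictions worse than the mean give a negative value.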
- set_fit_request(*, eval_set='$UNCHANGED$', sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (GBDTRegressor)
- Returns:
self – The updated object.
- Return type:
GBDTRegressor
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
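To make the <component>__<parameter> convention concrete, the sketch below shows how such keys can be separated into top-level and per-component parameters by splitting on the first double underscore. It is a simplified illustration of the naming scheme, not the actual set_params code path; group_nested_params and the component name "regressor" are hypothetical.

```python
def group_nested_params(params):
    """Split a flat params dict into top-level and per-component parts."""
    top_level = {}
    nested = {}
    for key, value in params.items():
        # Split on the FIRST "__" only; deeper nesting stays in sub_key.
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            top_level[name] = value
    return top_level, nested


top_level, nested = group_nested_params({
    "regressor__max_depth": 4,          # routed to the "regressor" component
    "regressor__learning_rate": 0.05,
    "verbose": 0,                       # applies to the outer object itself
})
```

Each component then receives its own sub-dictionary, which is why a single set_params call can update parameters at any level of a nested object.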
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (GBDTRegressor)
- Returns:
self – The updated object.
- Return type:
GBDTRegressor