Gradient Boosting Classification Learner
Source: R/learner_gbm_classif_gbm.R
mlr_learners_classif.gbm.Rd
Gradient Boosting Classification Algorithm.
Calls gbm::gbm() from gbm.
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, gbm
Parameters
| Id | Type | Default | Levels | Range |
| distribution | character | bernoulli | bernoulli, adaboost, huberized, multinomial | - |
| n.trees | integer | 100 | | \([1, \infty)\) |
| interaction.depth | integer | 1 | | \([1, \infty)\) |
| n.minobsinnode | integer | 10 | | \([1, \infty)\) |
| shrinkage | numeric | 0.001 | | \([0, \infty)\) |
| bag.fraction | numeric | 0.5 | | \([0, 1]\) |
| train.fraction | numeric | 1 | | \([0, 1]\) |
| cv.folds | integer | 0 | | \((-\infty, \infty)\) |
| keep.data | logical | FALSE | TRUE, FALSE | - |
| verbose | logical | FALSE | TRUE, FALSE | - |
| n.cores | integer | 1 | | \((-\infty, \infty)\) |
| var.monotone | untyped | - | | - |
Initial parameter values
keep.data is initialized to FALSE to save memory.
n.cores is initialized to 1 to avoid conflicts with parallelization through future.
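As a minimal sketch of how these initial values interact with user-supplied settings, the snippet below overrides a few hyperparameters from the table above when constructing the learner. The chosen values (500 trees, depth 3, shrinkage 0.01) are illustrative assumptions, not recommendations from this page.

```r
library(mlr3)
library(mlr3extralearners)

# Override selected hyperparameters at construction time.
# The concrete values are illustrative assumptions only.
learner = lrn("classif.gbm",
  n.trees = 500,
  interaction.depth = 3,
  shrinkage = 0.01
)

# The initial values keep.data = FALSE and n.cores = 1 stay in place
# unless they are overridden explicitly.
learner$param_set$values
```

Parameters that are not set here fall back to the defaults of gbm::gbm().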
References
Friedman, J. H. (2002). “Stochastic gradient boosting.” Computational Statistics & Data Analysis, 38(4), 367–378.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifGBM
Methods
Inherited methods
mlr3::Learner$base_learner()
mlr3::Learner$configure()
mlr3::Learner$encapsulate()
mlr3::Learner$format()
mlr3::Learner$help()
mlr3::Learner$predict()
mlr3::Learner$predict_newdata()
mlr3::Learner$print()
mlr3::Learner$reset()
mlr3::Learner$selected_features()
mlr3::Learner$train()
mlr3::LearnerClassif$predict_newdata_fast()
Method importance()
The importance scores are extracted by gbm::relative.influence() from
the model.
Returns
Named numeric().
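One possible use of the returned vector is a simple feature ranking. The sketch below keeps the five highest-scoring features of the sonar task; the cutoff of five and the variable names `top_features` and `task_small` are assumptions made for illustration.

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")
learner = lrn("classif.gbm")
learner$train(task)

# Sort explicitly so the ranking does not rely on the order of the returned vector
imp = sort(learner$importance(), decreasing = TRUE)

# Names of the five most influential features (the cutoff is arbitrary)
top_features = names(head(imp, 5))

# Restrict a copy of the task to those features; the original task is unchanged
task_small = task$clone()$select(top_features)
task_small$feature_names
```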
Examples
# Define the Learner
learner = lrn("classif.gbm")
print(learner)
#>
#> ── <LearnerClassifGBM> (classif.gbm): Gradient Boosting ────────────────────────
#> • Model: -
#> • Parameters: keep.data=FALSE, n.cores=1
#> • Packages: mlr3, mlr3extralearners, and gbm
#> • Predict Types: [response] and prob
#> • Feature Types: integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, missings, twoclass, and weights
#> • Other settings: use_weights = 'use', predict_raw = 'FALSE'
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
#> Distribution not specified, assuming bernoulli ...
print(learner$model)
#> gbm::gbm(formula = f, data = data, keep.data = FALSE, n.cores = 1L)
#> A gradient boosted model with bernoulli loss function.
#> 100 iterations were performed.
#> There were 60 predictors of which 41 had non-zero influence.
print(learner$importance())
#> V11 V12 V10 V51 V48 V9 V28
#> 17.9365418 11.9248695 11.7408569 7.4523328 5.6103080 4.9542492 4.9022236
#> V16 V52 V55 V43 V36 V31 V45
#> 4.7491850 4.6593772 4.0892153 3.8385486 3.8154296 3.6716497 3.0195812
#> V15 V54 V33 V3 V49 V35 V4
#> 2.9530430 2.7239859 2.6984784 2.6504328 2.4375914 2.1809172 2.1215538
#> V22 V40 V53 V27 V37 V19 V7
#> 1.9663874 1.9435503 1.6502768 1.6443528 1.3849831 1.0287192 0.9987320
#> V23 V8 V39 V26 V18 V32 V47
#> 0.9423577 0.9256464 0.8226228 0.7918384 0.7835729 0.7467324 0.6866661
#> V1 V21 V38 V17 V25 V20 V13
#> 0.5513605 0.5415936 0.5185820 0.4920542 0.3996497 0.2863317 0.0000000
#> V14 V2 V24 V29 V30 V34 V41
#> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> V42 V44 V46 V5 V50 V56 V57
#> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> V58 V59 V6 V60
#> 0.0000000 0.0000000 0.0000000 0.0000000
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.173913
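The example above scores the default "response" predictions. Since the learner also supports the "prob" predict type, a hedged variant that requests class probabilities and scores them with AUC could look like the following. The seed and the choice of `classif.auc` are assumptions for illustration, and no output is shown because results depend on the random partition.

```r
library(mlr3)
library(mlr3extralearners)

set.seed(1)  # arbitrary seed so the partition is reproducible
task = tsk("sonar")
ids = partition(task)

# Request class probabilities instead of hard class labels
learner = lrn("classif.gbm", predict_type = "prob")
learner$train(task, row_ids = ids$train)
predictions = learner$predict(task, row_ids = ids$test)

# One probability column per class
head(predictions$prob)

# AUC needs probabilities and a two-class task such as sonar
auc = predictions$score(msr("classif.auc"))
auc
```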