Gradient Boosting Classification Learner
Source: R/learner_gbm_classif_gbm.R
mlr_learners_classif.gbm.Rd
Gradient Boosting Classification Algorithm. Calls gbm::gbm() from gbm.
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, gbm
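The learner supports two predict types, "response" and "prob". A minimal sketch of requesting class probabilities, assuming the sonar task used in the Examples; the seed, the explicit distribution, and the AUC measure are illustrative choices, not part of the original page:

```r
library(mlr3)
library(mlr3extralearners)

# Request probability predictions instead of hard class labels
learner = lrn("classif.gbm", predict_type = "prob", distribution = "bernoulli")

task = tsk("sonar")
set.seed(1)  # illustrative seed so the split is reproducible
ids = partition(task)

learner$train(task, row_ids = ids$train)
pred = learner$predict(task, row_ids = ids$test)

# One column of probabilities per class
head(pred$prob)

# Probability predictions allow threshold-free measures such as AUC
pred$score(msr("classif.auc"))
```

With predict_type = "response", pred$prob is not populated and measures that need probabilities cannot be computed.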
Parameters
Id | Type | Default | Levels | Range |
distribution | character | bernoulli | bernoulli, adaboost, huberized, multinomial | - |
n.trees | integer | 100 | - | \([1, \infty)\) |
interaction.depth | integer | 1 | - | \([1, \infty)\) |
n.minobsinnode | integer | 10 | - | \([1, \infty)\) |
shrinkage | numeric | 0.001 | - | \([0, \infty)\) |
bag.fraction | numeric | 0.5 | - | \([0, 1]\) |
train.fraction | numeric | 1 | - | \([0, 1]\) |
cv.folds | integer | 0 | - | \((-\infty, \infty)\) |
keep.data | logical | FALSE | TRUE, FALSE | - |
verbose | logical | FALSE | TRUE, FALSE | - |
n.cores | integer | 1 | - | \((-\infty, \infty)\) |
var.monotone | untyped | - | - | - |
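The parameters in the table can be passed directly to lrn() when the learner is constructed. A minimal sketch; the specific values are illustrative assumptions, not tuning recommendations:

```r
library(mlr3)
library(mlr3extralearners)

# Set hyperparameters at construction time (values are illustrative only)
learner = lrn("classif.gbm",
  distribution = "bernoulli",
  n.trees = 500,
  interaction.depth = 3,
  shrinkage = 0.05
)

# Inspect the currently set values
learner$param_set$values
```

Passing distribution explicitly should also avoid the "Distribution not specified, assuming bernoulli ..." message that appears during training in the Examples.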
Initial parameter values
keep.data is initialized to FALSE to save memory.
n.cores is initialized to 1 to avoid conflicts with parallelization through future.
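These initial values are ordinary entries in the learner's parameter set, so they can be inspected and overridden like any other hyperparameter. A short sketch, assuming a freshly constructed learner:

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.gbm")

# Initial values set by the learner at construction
init_keep_data = learner$param_set$values$keep.data
init_n_cores = learner$param_set$values$n.cores
print(init_keep_data)
print(init_n_cores)

# Override an initial value, e.g. to store the training data in the model
learner$param_set$values$keep.data = TRUE
```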
References
Friedman, J H (2002). “Stochastic gradient boosting.” Computational Statistics & Data Analysis, 38(4), 367–378.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
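As a sketch of the first pointer above, the learner table can be filtered to confirm that classif.gbm is registered in the current session. The column names used here (key, task_type, predict_types) are assumed to match the installed mlr3 version:

```r
library(mlr3)
library(mlr3extralearners)
library(data.table)

# Table of all learners known to the running session
tab = as.data.table(mlr_learners)

# Look up the entry for this learner
tab[key == "classif.gbm", .(key, task_type, predict_types)]
```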
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifGBM
Methods
Inherited methods
mlr3::Learner$base_learner()
mlr3::Learner$configure()
mlr3::Learner$encapsulate()
mlr3::Learner$format()
mlr3::Learner$help()
mlr3::Learner$predict()
mlr3::Learner$predict_newdata()
mlr3::Learner$print()
mlr3::Learner$reset()
mlr3::Learner$selected_features()
mlr3::Learner$train()
mlr3::LearnerClassif$predict_newdata_fast()
Method importance()
The importance scores are extracted by gbm::relative.influence() from the model.
Returns
Named numeric().
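One possible use of importance() is a simple filter that drops features the fitted model never used. A minimal sketch, assuming the sonar task; the explicit sort, the seed, and the zero threshold are illustrative choices:

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")
learner = lrn("classif.gbm", distribution = "bernoulli")

set.seed(1)  # illustrative seed; gbm subsamples rows when bag.fraction < 1
learner$train(task)

# Named numeric vector, sorted explicitly rather than relying on the returned order
imp = sort(learner$importance(), decreasing = TRUE)
head(imp, 5)

# Keep only features with non-zero influence, on a copy of the task
keep = names(imp)[imp > 0]
task_small = task$clone()$select(keep)
task_small$feature_names
```

importance() requires a trained model, so it has to be called after $train().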
Examples
# Define the Learner
learner = lrn("classif.gbm")
print(learner)
#>
#> ── <LearnerClassifGBM> (classif.gbm): Gradient Boosting ────────────────────────
#> • Model: -
#> • Parameters: keep.data=FALSE, n.cores=1
#> • Packages: mlr3, mlr3extralearners, and gbm
#> • Predict Types: [response] and prob
#> • Feature Types: integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, missings, twoclass, and weights
#> • Other settings: use_weights = 'use'
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
#> Distribution not specified, assuming bernoulli ...
print(learner$model)
#> gbm::gbm(formula = f, data = data, keep.data = FALSE, n.cores = 1L)
#> A gradient boosted model with bernoulli loss function.
#> 100 iterations were performed.
#> There were 60 predictors of which 39 had non-zero influence.
print(learner$importance())
#>        V11        V37        V45        V12        V48        V55        V51
#> 23.5815039 11.4462457  9.0115089  8.2640928  6.6770570  6.4930100  5.3268483
#>        V36        V31        V44        V10        V16         V4        V52
#>  5.1805523  4.5324090  4.2570807  4.1467486  3.9196451  3.7976187  2.8610125
#>        V20        V34        V46         V5        V39        V57        V15
#>  2.8146249  2.1330937  2.1293190  2.1188198  2.1148052  1.9423557  1.7795206
#>        V21        V43        V35        V49        V17        V47        V33
#>  1.4437540  1.2751104  1.1839336  1.1712199  0.9623875  0.9109990  0.8510703
#>         V6        V27        V23        V25        V28        V29        V24
#>  0.8410606  0.8125996  0.7325085  0.6734647  0.5340257  0.4954861  0.4476179
#>        V13        V14         V8        V53         V1        V18        V19
#>  0.4067810  0.3936904  0.3646881  0.3534359  0.0000000  0.0000000  0.0000000
#>         V2        V22        V26         V3        V30        V32        V38
#>  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000
#>        V40        V41        V42        V50        V54        V56        V58
#>  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000
#>        V59        V60         V7         V9
#>  0.0000000  0.0000000  0.0000000  0.0000000
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.1014493
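The single train/test split above gives one error estimate that depends on the random partition. A sketch of a cross-validated estimate with resample(); the fold count and seed are illustrative assumptions:

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")
learner = lrn("classif.gbm", distribution = "bernoulli")

set.seed(1)  # illustrative seed for reproducible folds
# 3-fold cross-validation instead of a single holdout split
rr = resample(task, learner, rsmp("cv", folds = 3))

# Classification error averaged over the folds
ce = rr$aggregate(msr("classif.ce"))
print(ce)
```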