Classification Random Forest Learner
Source: R/learner_randomForest_classif_randomForest.R
Random forest for classification.
Calls randomForest::randomForest() from the randomForest package.
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, randomForest
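The predict types listed above can be selected when the learner is constructed. The following is a minimal sketch, assuming the mlr3, mlr3extralearners, and randomForest packages are installed; the choice of the "sonar" task and the AUC measure is only for illustration.

```r
library(mlr3)
library(mlr3extralearners)

# Request class probabilities instead of the default "response" predictions
learner = lrn("classif.randomForest", predict_type = "prob")

# Illustrative binary task and train/test split
task = tsk("sonar")
ids = partition(task)

learner$train(task, row_ids = ids$train)
prediction = learner$predict(task, row_ids = ids$test)

# Matrix of class probabilities, one column per class
head(prediction$prob)

# Probability-based measures such as AUC require predict_type = "prob"
prediction$score(msr("classif.auc"))
```

With predict_type = "prob", the prediction object still carries a response column, so response-based measures such as classif.ce remain usable.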
Parameters
Id | Type | Default | Levels | Range |
ntree | integer | 500 | | \([1, \infty)\) |
mtry | integer | - | | \([1, \infty)\) |
replace | logical | TRUE | TRUE, FALSE | - |
classwt | untyped | NULL | | - |
cutoff | untyped | - | | - |
strata | untyped | - | | - |
sampsize | untyped | - | | - |
nodesize | integer | 1 | | \([1, \infty)\) |
maxnodes | integer | - | | \([1, \infty)\) |
importance | character | FALSE | accuracy, gini, none | - |
localImp | logical | FALSE | TRUE, FALSE | - |
proximity | logical | FALSE | TRUE, FALSE | - |
oob.prox | logical | - | TRUE, FALSE | - |
norm.votes | logical | TRUE | TRUE, FALSE | - |
do.trace | logical | FALSE | TRUE, FALSE | - |
keep.forest | logical | TRUE | TRUE, FALSE | - |
keep.inbag | logical | FALSE | TRUE, FALSE | - |
predict.all | logical | FALSE | TRUE, FALSE | - |
nodes | logical | FALSE | TRUE, FALSE | - |
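As a rough illustration of how the parameters in the table can be set, the sketch below passes some of them at construction and updates another afterwards through the learner's parameter set. The specific values (200 trees, mtry = 5, and so on) are arbitrary assumptions, not recommendations.

```r
library(mlr3)
library(mlr3extralearners)

# Set hyperparameters when constructing the learner (values are illustrative)
learner = lrn("classif.randomForest", ntree = 200, mtry = 5, nodesize = 3)

# Update or add further values afterwards via the parameter set
learner$param_set$set_values(maxnodes = 20)

# Inspect the values that are currently set
print(learner$param_set$values)
```

Values outside the ranges in the table (for example ntree = 0) should be rejected by the parameter set when they are assigned.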
References
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRandomForest
Methods
Inherited methods
mlr3::Learner$base_learner()
mlr3::Learner$configure()
mlr3::Learner$encapsulate()
mlr3::Learner$format()
mlr3::Learner$help()
mlr3::Learner$predict()
mlr3::Learner$predict_newdata()
mlr3::Learner$print()
mlr3::Learner$reset()
mlr3::Learner$selected_features()
mlr3::Learner$train()
mlr3::LearnerClassif$predict_newdata_fast()
Method importance()
The importance scores are extracted from the slot importance.
Parameter 'importance' must be set to either "accuracy" or "gini".
Returns
Named numeric().
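A short sketch of the requirement described above, here using "gini" for the importance parameter. It assumes the randomForest package is installed and trains on the full "sonar" task purely for illustration.

```r
library(mlr3)
library(mlr3extralearners)

# importance must be "accuracy" or "gini" for $importance() to work
learner = lrn("classif.randomForest", importance = "gini")
learner$train(tsk("sonar"))

# Named numeric vector with one score per feature
imp = learner$importance()
head(imp, 5)

# Names of the five highest-scoring features
names(imp)[order(imp, decreasing = TRUE)[1:5]]
```

The names of the returned vector are the feature names of the task, so the result can be used directly to subset features.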
Examples
# Define the Learner
learner = lrn("classif.randomForest", importance = "accuracy")
print(learner)
#>
#> ── <LearnerClassifRandomForest> (classif.randomForest): Random Forest ──────────
#> • Model: -
#> • Parameters: importance=accuracy
#> • Packages: mlr3, mlr3extralearners, and randomForest
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, multiclass, oob_error, twoclass, and weights
#> • Other settings: use_weights = 'use'
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
#>
#> Call:
#> randomForest(formula = formula, data = data, classwt = classwt, cutoff = cutoff, importance = TRUE)
#> Type of random forest: classification
#> Number of trees: 500
#> No. of variables tried at each split: 7
#>
#> OOB estimate of error rate: 19.42%
#> Confusion matrix:
#>    M  R class.error
#> M 71  6  0.07792208
#> R 21 41  0.33870968
print(learner$importance())
#> V11 V12 V48 V9 V47
#> 2.461241e-02 1.544716e-02 1.527828e-02 1.005108e-02 9.566834e-03
#> V10 V21 V49 V36 V45
#> 8.449836e-03 8.263279e-03 7.890766e-03 6.706041e-03 5.635728e-03
#> V46 V52 V37 V51 V13
#> 5.110197e-03 4.764403e-03 4.420774e-03 4.297839e-03 3.945542e-03
#> V4 V20 V27 V28 V16
#> 3.943313e-03 3.920783e-03 3.346983e-03 3.284595e-03 3.184550e-03
#> V17 V1 V15 V23 V29
#> 3.072806e-03 2.950673e-03 2.942748e-03 2.730094e-03 2.469644e-03
#> V43 V24 V18 V31 V35
#> 2.354238e-03 2.104876e-03 1.994756e-03 1.822817e-03 1.702382e-03
#> V14 V19 V6 V22 V44
#> 1.636338e-03 1.596024e-03 1.544131e-03 1.539607e-03 1.503039e-03
#> V8 V40 V33 V53 V30
#> 1.466266e-03 1.333073e-03 1.326329e-03 1.325661e-03 1.073061e-03
#> V39 V42 V26 V32 V34
#> 1.030703e-03 9.612238e-04 9.250498e-04 9.215604e-04 8.241843e-04
#> V38 V55 V2 V58 V54
#> 8.110570e-04 6.799048e-04 6.786599e-04 6.713096e-04 5.177222e-04
#> V50 V3 V59 V41 V7
#> 3.685612e-04 3.076330e-04 2.807651e-04 2.757778e-04 1.615806e-04
#> V57 V56 V5 V25 V60
#> 9.529542e-05 3.843625e-05 -6.632672e-05 -2.880786e-04 -3.831219e-04
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.1449275