
Random forest for classification. Calls randomForest::randomForest() from package randomForest.

Dictionary

This Learner can be instantiated via lrn():

lrn("classif.randomForest")

Meta Information

  • Task type: “classif”

  • Predict Types: “response”, “prob”

  • Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”

  • Required Packages: mlr3, mlr3extralearners, randomForest

Parameters

Id           Type       Default  Levels                Range
ntree        integer    500                            [1, ∞)
mtry         integer    -                              [1, ∞)
replace      logical    TRUE     TRUE, FALSE           -
classwt      untyped    NULL                           -
cutoff       untyped    -                              -
strata       untyped    -                              -
sampsize     untyped    -                              -
nodesize     integer    1                              [1, ∞)
maxnodes     integer    -                              [1, ∞)
importance   character  FALSE    accuracy, gini, none  -
localImp     logical    FALSE    TRUE, FALSE           -
proximity    logical    FALSE    TRUE, FALSE           -
oob.prox     logical    -        TRUE, FALSE           -
norm.votes   logical    TRUE     TRUE, FALSE           -
do.trace     logical    FALSE    TRUE, FALSE           -
keep.forest  logical    TRUE     TRUE, FALSE           -
keep.inbag   logical    FALSE    TRUE, FALSE           -
predict.all  logical    FALSE    TRUE, FALSE           -
nodes        logical    FALSE    TRUE, FALSE           -
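
Hyperparameters can be set when the learner is constructed or updated later through its parameter set. A minimal sketch, assuming mlr3 and mlr3extralearners are attached and using purely illustrative values:

# set hyperparameters at construction time
learner = lrn("classif.randomForest", ntree = 1000, mtry = 5)

# or update them on an existing learner via its ParamSet
learner$param_set$set_values(nodesize = 5, importance = "gini")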

References

Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.

Author

pat-s

Super classes

mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRandomForest

Methods

Inherited methods


Method new()

Creates a new instance of this R6 class.
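
Usage

LearnerClassifRandomForest$new()

Constructing the learner this way is equivalent to retrieving it from the dictionary with lrn("classif.randomForest").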


Method importance()

The importance scores are extracted from the model slot importance. The parameter importance must be set to "accuracy" or "gini".

Usage

LearnerClassifRandomForest$importance()

Returns

Named numeric().
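
For instance, after training with importance set, the top-ranked features can be inspected as in the sketch below (assuming mlr3 and mlr3extralearners are attached; results depend on the data and random seed):

learner = lrn("classif.randomForest", importance = "gini")
learner$train(tsk("sonar"))
# importance() returns a named numeric vector sorted in decreasing order
head(learner$importance(), 5)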


Method oob_error()

OOB errors are extracted from the model slot err.rate.

Usage

LearnerClassifRandomForest$oob_error()

Returns

numeric(1).
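
A small sketch of querying the OOB error on a trained learner (assuming mlr3 and mlr3extralearners are attached; the exact value depends on the data and random seed):

learner = lrn("classif.randomForest")
learner$train(tsk("sonar"))
# out-of-bag classification error extracted from the fitted forest
learner$oob_error()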


Method clone()

The objects of this class are cloneable with this method.

Usage

LearnerClassifRandomForest$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# Define the Learner
learner = lrn("classif.randomForest", importance = "accuracy")
print(learner)
#> 
#> ── <LearnerClassifRandomForest> (classif.randomForest): Random Forest ──────────
#> • Model: -
#> • Parameters: importance=accuracy
#> • Packages: mlr3, mlr3extralearners, and randomForest
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, multiclass, oob_error, twoclass, and weights
#> • Other settings: use_weights = 'use'

# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)

# Train the learner on the training ids
learner$train(task, row_ids = ids$train)

print(learner$model)
#> 
#> Call:
#>  randomForest(formula = formula, data = data, classwt = classwt,      cutoff = cutoff, importance = TRUE) 
#>                Type of random forest: classification
#>                      Number of trees: 500
#> No. of variables tried at each split: 7
#> 
#>         OOB estimate of  error rate: 14.39%
#> Confusion matrix:
#>    M  R class.error
#> M 60  8   0.1176471
#> R 12 59   0.1690141
print(learner$importance())
#>          V11          V12           V9          V36          V10          V48 
#> 3.448115e-02 3.372397e-02 1.530611e-02 1.189177e-02 1.106419e-02 9.356790e-03 
#>          V49          V21          V13          V52          V20          V31 
#> 8.757093e-03 7.602632e-03 7.158614e-03 6.649958e-03 6.474549e-03 6.307245e-03 
#>          V47          V28           V4          V37          V16          V27 
#> 5.945863e-03 5.897690e-03 5.388130e-03 5.351670e-03 4.581828e-03 4.423813e-03 
#>          V15          V46          V14           V2          V17          V22 
#> 3.614040e-03 3.441548e-03 3.434278e-03 3.096317e-03 2.884624e-03 2.816984e-03 
#>          V18          V43          V51          V54           V8          V53 
#> 2.791509e-03 2.732561e-03 2.631786e-03 2.493405e-03 2.021486e-03 1.973944e-03 
#>          V45          V35           V7          V44           V3          V29 
#> 1.907982e-03 1.806278e-03 1.719787e-03 1.653756e-03 1.648688e-03 1.578448e-03 
#>           V6          V19          V39          V40          V26          V23 
#> 1.444103e-03 1.388466e-03 1.334827e-03 1.316069e-03 1.258623e-03 1.249696e-03 
#>          V59          V55          V30          V42          V24          V25 
#> 1.201216e-03 1.113941e-03 1.045681e-03 1.029433e-03 9.891184e-04 8.833336e-04 
#>          V34          V50          V57          V32          V58          V60 
#> 8.815794e-04 7.953702e-04 7.570283e-04 7.449713e-04 5.716226e-04 4.475982e-04 
#>          V33          V41          V38           V1           V5          V56 
#> 4.206174e-04 4.199607e-04 3.161711e-04 2.242641e-04 7.258823e-05 6.302070e-05 

# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)

# Score the predictions
predictions$score()
#> classif.ce 
#>  0.1449275
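
# score() defaults to the classification error (classif.ce); other measures
# can be passed explicitly, e.g. accuracy:
predictions$score(msr("classif.acc"))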