Random forest for classification. Calls randomForest::randomForest() from the randomForest package.

Dictionary

This Learner can be instantiated via lrn():

lrn("classif.randomForest")
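
Hyperparameters can be passed to lrn() at construction time. The following is a minimal sketch, assuming mlr3 and mlr3extralearners are installed; the values ntree = 200 and mtry = 5 are purely illustrative and not recommendations from this page.

```r
library(mlr3)
library(mlr3extralearners)

# Construct the learner and set two hyperparameters in one call
# (illustrative values, chosen only for this sketch)
learner = lrn("classif.randomForest", ntree = 200, mtry = 5)

# Inspect the hyperparameter values currently set on the learner
print(learner$param_set$values)
```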

Meta Information

  • Task type: “classif”

  • Predict Types: “response”, “prob”

  • Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”

  • Required Packages: mlr3, mlr3extralearners, randomForest
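
The predict types listed above can be selected when the learner is constructed. The sketch below assumes the built-in "sonar" task from mlr3 and, for brevity, predicts on the same rows used for training, so the predictions are not an honest performance estimate.

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")

# Request class probabilities instead of the default "response"
learner = lrn("classif.randomForest", predict_type = "prob")
print(learner$predict_types)
print(learner$feature_types)

learner$train(task)
pred = learner$predict(task)

# One column of probabilities per class level
print(head(pred$prob))
```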

Parameters

Id           Type       Default  Levels                Range
ntree        integer    500                            [1, ∞)
mtry         integer    -                              [1, ∞)
replace      logical    TRUE     TRUE, FALSE           -
classwt      untyped    NULL                           -
cutoff       untyped    -                              -
strata       untyped    -                              -
sampsize     untyped    -                              -
nodesize     integer    1                              [1, ∞)
maxnodes     integer    -                              [1, ∞)
importance   character  FALSE    accuracy, gini, none  -
localImp     logical    FALSE    TRUE, FALSE           -
proximity    logical    FALSE    TRUE, FALSE           -
oob.prox     logical    -        TRUE, FALSE           -
norm.votes   logical    TRUE     TRUE, FALSE           -
do.trace     logical    FALSE    TRUE, FALSE           -
keep.forest  logical    TRUE     TRUE, FALSE           -
keep.inbag   logical    FALSE    TRUE, FALSE           -
predict.all  logical    FALSE    TRUE, FALSE           -
nodes        logical    FALSE    TRUE, FALSE           -
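
Parameters can also be changed after construction through the learner's parameter set. This is a minimal sketch, assuming a paradox version that provides ParamSet$set_values(); the chosen values are illustrative only.

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.randomForest")

# Update several hyperparameters on the existing learner
# (illustrative values, not tuned for any task)
learner$param_set$set_values(ntree = 100, nodesize = 5, replace = FALSE)
print(learner$param_set$values)

# List the ids of all parameters the learner exposes
print(learner$param_set$ids())
```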

References

Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.

Author

pat-s

Super classes

mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRandomForest

Methods


LearnerClassifRandomForest$new()

Creates a new instance of this R6 class.


LearnerClassifRandomForest$importance()

The importance scores are extracted from the importance slot of the fitted model. The parameter importance must be set to either "accuracy" or "gini" before training.

Usage

LearnerClassifRandomForest$importance()

Returns

Named numeric().
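
As a hedged illustration of the method above, the sketch below requests Gini-based importance and prints the first few scores. It assumes the built-in "sonar" task from mlr3; the seed is arbitrary and only makes the run repeatable.

```r
library(mlr3)
library(mlr3extralearners)

set.seed(1)  # arbitrary seed, for repeatability of this sketch only
task = tsk("sonar")

# The importance parameter must be set before training
learner = lrn("classif.randomForest", importance = "gini")
learner$train(task)

# Named numeric vector with one score per feature
imp = learner$importance()
print(head(imp, 5))
```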


LearnerClassifRandomForest$oob_error()

The out-of-bag (OOB) error is extracted from the err.rate slot of the fitted model.

Usage

LearnerClassifRandomForest$oob_error()

Returns

numeric(1).
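
A minimal usage sketch for this method, assuming the built-in "sonar" task from mlr3. The learner must be trained first, because the OOB error is read from the fitted model.

```r
library(mlr3)
library(mlr3extralearners)

set.seed(1)  # arbitrary seed, for repeatability of this sketch only
task = tsk("sonar")

learner = lrn("classif.randomForest")
learner$train(task)

# Single number summarising the out-of-bag error of the fitted forest
err = learner$oob_error()
print(err)
```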


LearnerClassifRandomForest$clone()

The objects of this class are cloneable with this method.

Usage

LearnerClassifRandomForest$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
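
The sketch below illustrates why a deep clone is usually the safer choice when a copy of the learner should be modified independently. The variable name copy and the parameter values are illustrative assumptions.

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.randomForest", ntree = 500)

# A deep clone also copies nested R6 objects such as the parameter set
copy = learner$clone(deep = TRUE)
copy$param_set$set_values(ntree = 50)

# The original learner keeps its own value
print(learner$param_set$values$ntree)
print(copy$param_set$values$ntree)
```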

Examples

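# Attach the packages that provide lrn(), tsk() and partition()
library(mlr3)
library(mlr3extralearners)

# Define the Learner
learner = lrn("classif.randomForest", importance = "accuracy")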
print(learner)
#> 
#> ── <LearnerClassifRandomForest> (classif.randomForest): Random Forest ──────────
#> • Model: -
#> • Parameters: importance=accuracy
#> • Packages: mlr3, mlr3extralearners, and randomForest
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, multiclass, oob_error, twoclass, and weights
#> • Other settings: use_weights = 'use', predict_raw = 'FALSE'

# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)

# Train the learner on the training ids
learner$train(task, row_ids = ids$train)

print(learner$model)
#> 
#> Call:
#>  randomForest(formula = formula, data = data, classwt = classwt,      cutoff = cutoff, importance = TRUE) 
#>                Type of random forest: classification
#>                      Number of trees: 500
#> No. of variables tried at each split: 7
#> 
#>         OOB estimate of  error rate: 15.83%
#> Confusion matrix:
#>    M  R class.error
#> M 63  9   0.1250000
#> R 13 54   0.1940299
print(learner$importance())
#>           V11           V12           V10            V9           V27 
#>  2.056405e-02  1.851822e-02  1.126801e-02  1.107798e-02  7.864181e-03 
#>           V48           V47           V37           V36           V17 
#>  7.790243e-03  7.629455e-03  7.148193e-03  6.125872e-03  5.681058e-03 
#>           V15           V51           V13           V49           V18 
#>  5.468498e-03  5.208916e-03  5.191214e-03  5.048685e-03  4.748630e-03 
#>           V45           V43           V28           V16           V52 
#>  4.741127e-03  4.677790e-03  4.286647e-03  4.269438e-03  3.896066e-03 
#>           V35           V20           V23           V46           V44 
#>  3.877863e-03  3.814270e-03  3.780875e-03  3.398750e-03  2.753819e-03 
#>           V21           V31           V42           V30           V40 
#>  2.525773e-03  2.143617e-03  2.044686e-03  1.999084e-03  1.964640e-03 
#>           V34           V29           V26            V6           V39 
#>  1.791392e-03  1.742247e-03  1.712115e-03  1.655545e-03  1.650808e-03 
#>            V4           V22            V2           V24           V58 
#>  1.583947e-03  1.577984e-03  1.497449e-03  1.285092e-03  1.269948e-03 
#>            V8           V53           V55           V19           V50 
#>  1.262469e-03  1.219547e-03  9.172683e-04  8.949251e-04  6.891141e-04 
#>            V5            V3           V38           V59           V14 
#>  6.372740e-04  5.304904e-04  4.755301e-04  4.500029e-04  4.422899e-04 
#>           V33           V54            V1           V32           V56 
#>  4.340487e-04  4.272002e-04  2.845862e-04  2.754290e-04  1.895953e-04 
#>           V25           V57           V60            V7           V41 
#>  1.679636e-04  1.617014e-04 -6.174969e-05 -1.354207e-04 -7.446964e-04 

# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)

# Score the predictions
predictions$score()
#> classif.ce 
#>  0.1594203
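
A single train/test split gives a noisy performance estimate. As a hedged follow-up to the example above, the sketch below uses mlr3's resample() with 3-fold cross-validation; the fold count and seed are arbitrary choices made for this sketch.

```r
library(mlr3)
library(mlr3extralearners)

set.seed(1)  # arbitrary seed, for repeatability of this sketch only
task = tsk("sonar")
learner = lrn("classif.randomForest")

# 3-fold cross-validation (fold count chosen only for illustration)
rr = resample(task, learner, rsmp("cv", folds = 3))

# Classification error averaged over the folds
ce = rr$aggregate(msr("classif.ce"))
print(ce)
```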