Random forest for classification. Calls randomForest::randomForest() from the randomForest package.

Dictionary

This Learner can be instantiated via lrn():

lrn("classif.randomForest")

Meta Information

  • Task type: “classif”

  • Predict Types: “response”, “prob”

  • Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”

  • Required Packages: mlr3, mlr3extralearners, randomForest
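The predict types listed above can be selected when the learner is constructed. The following is a minimal sketch, assuming mlr3 and mlr3extralearners are installed and using the "sonar" task that also appears in the Examples section; the variable names are illustrative only.

```r
library(mlr3)
library(mlr3extralearners)

# Ask for class probabilities instead of the default "response" predict type
learner = lrn("classif.randomForest", predict_type = "prob")

task = tsk("sonar")
ids = partition(task)

learner$train(task, row_ids = ids$train)
prediction = learner$predict(task, row_ids = ids$test)

# Matrix with one probability column per class level of the target
head(prediction$prob)
```

With predict_type = "prob" the prediction object still carries a response column, so the default classification error can be scored as usual.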

Parameters

Id          Type       Default  Levels                Range
ntree       integer    500                            \([1, \infty)\)
mtry        integer    -                              \([1, \infty)\)
replace     logical    TRUE     TRUE, FALSE           -
classwt     untyped    NULL                           -
cutoff      untyped    -                              -
strata      untyped    -                              -
sampsize    untyped    -                              -
nodesize    integer    1                              \([1, \infty)\)
maxnodes    integer    -                              \([1, \infty)\)
importance  character  FALSE    accuracy, gini, none  -
localImp    logical    FALSE    TRUE, FALSE           -
proximity   logical    FALSE    TRUE, FALSE           -
oob.prox    logical    -        TRUE, FALSE           -
norm.votes  logical    TRUE     TRUE, FALSE           -
do.trace    logical    FALSE    TRUE, FALSE           -
keep.forest logical    TRUE     TRUE, FALSE           -
keep.inbag  logical    FALSE    TRUE, FALSE           -
predict.all logical    FALSE    TRUE, FALSE           -
nodes       logical    FALSE    TRUE, FALSE           -
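The parameters in the table can be set either at construction or afterwards through the learner's parameter set. A short sketch, assuming a paradox version that provides ParamSet$set_values(); the chosen values are arbitrary and not recommendations.

```r
library(mlr3)
library(mlr3extralearners)

# Set hyperparameters at construction ...
learner = lrn("classif.randomForest", ntree = 200, mtry = 5)

# ... or update them later through the parameter set
learner$param_set$set_values(nodesize = 3, importance = "gini")

# Inspect the values that are currently set
learner$param_set$values
```

Values outside the listed range or levels (for example ntree = 0) should be rejected by the parameter set before any model is trained.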

References

Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565. doi:10.1023/A:1010933404324 .

Author

pat-s

Super classes

mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRandomForest

Methods


LearnerClassifRandomForest$new()

Creates a new instance of this R6 class.


LearnerClassifRandomForest$importance()

Importance scores are extracted from the model slot importance. The parameter importance must be set to either "accuracy" or "gini" before training.

Usage

LearnerClassifRandomForest$importance()

Returns

Named numeric().
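As an illustration of the Gini variant, the sketch below trains on the "sonar" task and looks at the first few scores. It assumes mlr3 and mlr3extralearners are installed; the object name imp is hypothetical.

```r
library(mlr3)
library(mlr3extralearners)

# importance must be requested before training
learner = lrn("classif.randomForest", importance = "gini")
learner$train(tsk("sonar"))

# Named numeric vector with one score per feature
imp = learner$importance()
head(imp, 5)
```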


LearnerClassifRandomForest$oob_error()

OOB errors are extracted from the model slot err.rate.

Usage

LearnerClassifRandomForest$oob_error()

Returns

numeric(1).
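A minimal sketch of reading the out-of-bag error after training, assuming mlr3 and mlr3extralearners are installed; ntree = 100 is an arbitrary choice to keep the run short.

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.randomForest", ntree = 100)
learner$train(tsk("sonar"))

# Single number summarising the out-of-bag error of the fitted forest
oob = learner$oob_error()
oob
```

Because the estimate comes from the rows left out of each tree's bootstrap sample, it is available without a separate test set.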


LearnerClassifRandomForest$clone()

The objects of this class are cloneable with this method.

Usage

LearnerClassifRandomForest$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
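The effect of deep = TRUE can be sketched as follows, assuming mlr3 and mlr3extralearners are installed. The expectation is that the copy receives its own parameter set, so changing it leaves the original learner untouched.

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.randomForest", ntree = 100)

# Deep clone: nested R6 objects such as the parameter set are copied too
copy = learner$clone(deep = TRUE)
copy$param_set$set_values(ntree = 10)

# The original keeps its own setting
learner$param_set$values$ntree
copy$param_set$values$ntree
```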

Examples

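# Load the packages that provide lrn(), tsk(), and this learner
library(mlr3)
library(mlr3extralearners)

# Define the Learner
learner = lrn("classif.randomForest", importance = "accuracy")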
print(learner)
#> 
#> ── <LearnerClassifRandomForest> (classif.randomForest): Random Forest ──────────
#> • Model: -
#> • Parameters: importance=accuracy
#> • Packages: mlr3, mlr3extralearners, and randomForest
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, multiclass, oob_error, twoclass, and weights
#> • Other settings: use_weights = 'use', predict_raw = 'FALSE'

# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)

# Train the learner on the training ids
learner$train(task, row_ids = ids$train)

print(learner$model)
#> 
#> Call:
#>  randomForest(formula = formula, data = data, classwt = classwt,      cutoff = cutoff, importance = TRUE) 
#>                Type of random forest: classification
#>                      Number of trees: 500
#> No. of variables tried at each split: 7
#> 
#>         OOB estimate of  error rate: 16.55%
#> Confusion matrix:
#>    M  R class.error
#> M 70  9   0.1139241
#> R 14 46   0.2333333
print(learner$importance())
#>           V11           V12           V10            V9           V31 
#>  0.0321829752  0.0253208948  0.0203196406  0.0128816750  0.0106049670 
#>            V4           V52           V21           V20           V27 
#>  0.0074348730  0.0072066092  0.0066921531  0.0063164258  0.0058970089 
#>           V13           V17           V37           V15           V49 
#>  0.0058832634  0.0048607037  0.0045686832  0.0045678878  0.0044886465 
#>           V18            V5           V26           V28           V48 
#>  0.0039924204  0.0038740225  0.0034708730  0.0034273062  0.0033734592 
#>           V51           V39           V46           V36           V16 
#>  0.0030587378  0.0030502175  0.0028644921  0.0028045260  0.0027735318 
#>           V23           V47            V8           V22           V43 
#>  0.0026768724  0.0024676911  0.0024229940  0.0023328857  0.0021946865 
#>           V32           V44           V40           V35           V24 
#>  0.0021658089  0.0020904072  0.0020296441  0.0016381571  0.0015997682 
#>           V45           V30           V59            V3           V25 
#>  0.0015835699  0.0015732013  0.0015613091  0.0014595255  0.0014491378 
#>           V55           V57            V2           V19            V6 
#>  0.0012916944  0.0011863696  0.0009026196  0.0008718790  0.0008496298 
#>           V53           V29           V14           V41           V42 
#>  0.0007455963  0.0007137262  0.0006815615  0.0005700172  0.0005320781 
#>           V34           V50           V58           V38           V33 
#>  0.0004965598  0.0004409074  0.0004135570  0.0004065958  0.0003049074 
#>           V54            V1           V60           V56            V7 
#>  0.0003002766  0.0002032145  0.0001799077  0.0001136230 -0.0006870286 

# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)

# Score the predictions
predictions$score()
#> classif.ce 
#>  0.2318841
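
The default score above is the classification error. As a hedged follow-up sketch, the same kind of prediction object can be scored with another measure or inspected as a confusion matrix; this assumes the standard mlr3 measure key "classif.acc" and re-creates the objects so the snippet runs on its own.

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")
ids = partition(task)

learner = lrn("classif.randomForest")
learner$train(task, row_ids = ids$train)
predictions = learner$predict(task, row_ids = ids$test)

# Classification error (the default) and accuracy
ce = predictions$score(msr("classif.ce"))
acc = predictions$score(msr("classif.acc"))

# Cross-tabulation of predicted and true classes on the test rows
predictions$confusion
```

Accuracy is the complement of the classification error, so the two scores add up to 1.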