Classification Random Forest SRC Learner

Dictionary

This Learner can be instantiated via lrn():

lrn("classif.rfsrc")

Meta Information

Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3extralearners, randomForestSRC

Parameters

Id	Type	Default	Levels	Range
ntree	integer	500		$[1, \infty)$
mtry	integer	-		$[1, \infty)$
mtry.ratio	numeric	-		$[0, 1]$
nodesize	integer	15		$[1, \infty)$
nodedepth	integer	-		$[1, \infty)$
splitrule	character	gini	gini, auc, entropy	-
nsplit	integer	10		$[0, \infty)$
importance	character	FALSE	FALSE, TRUE, none, permute, random, anti	-
block.size	integer	10		$[1, \infty)$
bootstrap	character	by.root	by.root, by.node, none, by.user	-
samptype	character	swor	swor, swr	-
samp	untyped	-		-
membership	logical	FALSE	TRUE, FALSE	-
sampsize	untyped	-		-
sampsize.ratio	numeric	-		$[0, 1]$
na.action	character	na.omit	na.omit, na.impute	-
nimpute	integer	1		$[1, \infty)$
proximity	character	FALSE	FALSE, TRUE, inbag, oob, all	-
distance	character	FALSE	FALSE, TRUE, inbag, oob, all	-
forest.wt	character	FALSE	FALSE, TRUE, inbag, oob, all	-
xvar.wt	untyped	-		-
split.wt	untyped	-		-
forest	logical	TRUE	TRUE, FALSE	-
var.used	character	FALSE	FALSE, all.trees	-
split.depth	character	FALSE	FALSE, all.trees, by.tree	-
seed	integer	-		$(-\infty, -1]$
do.trace	logical	FALSE	TRUE, FALSE	-
get.tree	untyped	-		-
outcome	character	train	train, test	-
ptn.count	integer	0		$[0, \infty)$
cores	integer	1		$[1, \infty)$
save.memory	logical	FALSE	TRUE, FALSE	-
perf.type	character	-	gmean, misclass, brier, none	-
case.depth	logical	FALSE	TRUE, FALSE	-
marginal.xvar	untyped	NULL		-

Custom mlr3 parameters

mtry: This hyperparameter can alternatively be set via the added hyperparameter mtry.ratio as mtry = max(ceiling(mtry.ratio * n_features), 1). Note that mtry and mtry.ratio are mutually exclusive.
sampsize: This hyperparameter can alternatively be set via the added hyperparameter sampsize.ratio as sampsize = max(ceiling(sampsize.ratio * n_obs), 1). Note that sampsize and sampsize.ratio are mutually exclusive.
cores: This value is set as the option rf.cores during training and is set to 1 by default.
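
The two ratio conversions above can be sketched in base R. This is only an illustration of the documented formula: the helper name ratio_to_count is hypothetical and not part of mlr3extralearners, and the counts of 60 features and 139 observations are borrowed from the sonar example in the Examples section.

```r
# Hypothetical helper (not part of mlr3extralearners): converts a ratio in
# [0, 1] into an absolute count, following the documented formula
# max(ceiling(ratio * n), 1).
ratio_to_count = function(ratio, n) {
  stopifnot(ratio >= 0, ratio <= 1, n >= 1)
  max(ceiling(ratio * n), 1)
}

# mtry.ratio = 0.25 with 60 features
mtry = ratio_to_count(0.25, 60)

# sampsize.ratio = 0.5 with 139 observations
sampsize = ratio_to_count(0.5, 139)

# A ratio of 0 still yields a count of at least 1
mtry_min = ratio_to_count(0, 60)
```

Under these assumptions, lrn("classif.rfsrc", mtry.ratio = 0.25) on a task with 60 features would be expected to behave like setting mtry = 15 directly.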

References

Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.

Author

RaphaelS1

Super classes

mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRandomForestSRC

Methods

Public methods

LearnerClassifRandomForestSRC$new()
LearnerClassifRandomForestSRC$importance()
LearnerClassifRandomForestSRC$selected_features()
LearnerClassifRandomForestSRC$oob_error()
LearnerClassifRandomForestSRC$clone()

Method `new()`

Creates a new instance of this R6 class.

Usage

LearnerClassifRandomForestSRC$new()

Method `importance()`

The importance scores are extracted from the model slot importance. The values returned are those stored under 'all', i.e. the overall scores rather than the class-specific ones.

Usage

LearnerClassifRandomForestSRC$importance()

Returns

Named numeric().

Method `selected_features()`

Selected features are extracted from the model slot var.used.

Note: Due to a known issue in randomForestSRC, enabling var.used = "all.trees" causes prediction to fail. Therefore, this setting should be used exclusively for feature selection purposes and not when prediction is required.

Usage

LearnerClassifRandomForestSRC$selected_features()

Returns

character().
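
The note above suggests a two-step workflow: one forest fitted with var.used = "all.trees" purely to obtain the selected features, and a second forest without that setting for prediction. The following is a minimal sketch, assuming the sonar task from the Examples section. Restricting a cloned task with task$select() is one possible way to carry the selection over, not something this page prescribes.

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")

# Step 1: forest used only for feature selection (do not predict with it)
selector = lrn("classif.rfsrc", var.used = "all.trees")
selector$train(task)
feats = selector$selected_features()

# Step 2: separate forest without var.used, trained on the selected features
task_sel = task$clone()
task_sel$select(feats)
learner = lrn("classif.rfsrc")
learner$train(task_sel)

# Prediction works here because var.used is not set on this learner;
# predicting on the training rows is only for illustration
predictions = learner$predict(task_sel)
```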

Method `oob_error()`

The out-of-bag (OOB) error is extracted from the model slot err.rate.

Usage

LearnerClassifRandomForestSRC$oob_error()

Returns

numeric().
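
As a brief sketch of how this method might be called, assuming the sonar task from the Examples section and default hyperparameters:

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")
learner = lrn("classif.rfsrc")
learner$train(task)

# OOB error of the fitted forest, taken from the model slot err.rate
err = learner$oob_error()
print(err)
```

The learner must be trained first, since the value is read from the fitted model.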

Method `clone()`

The objects of this class are cloneable with this method.

Usage

LearnerClassifRandomForestSRC$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.
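
A short sketch of why deep = TRUE can matter, assuming the usual R6 semantics in which a deep clone also copies nested R6 objects such as the learner's parameter set. The ntree values are arbitrary and chosen only for illustration.

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.rfsrc", ntree = 100)

# Deep clone: the parameter set is copied along with the learner
copy = learner$clone(deep = TRUE)
copy$param_set$values$ntree = 200

# The original learner keeps its own setting
original_ntree = learner$param_set$values$ntree
copy_ntree = copy$param_set$values$ntree
```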

Examples

# Define the Learner
learner = lrn("classif.rfsrc", importance = "TRUE")
print(learner)
#> 
#> ── <LearnerClassifRandomForestSRC> (classif.rfsrc): Random Forest ──────────────
#> • Model: -
#> • Parameters: importance=TRUE
#> • Packages: mlr3, mlr3extralearners, and randomForestSRC
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, and factor
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, missings, multiclass, oob_error, selected_features,
#> twoclass, and weights
#> • Other settings: use_weights = 'use'

# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)

# Train the learner on the training ids
learner$train(task, row_ids = ids$train)

print(learner$model)
#>                          Sample size: 139
#>            Frequency of class labels: M=75, R=64
#>                      Number of trees: 500
#>            Forest terminal node size: 1
#>        Average no. of terminal nodes: 17.516
#> No. of variables tried at each split: 8
#>               Total no. of variables: 60
#>        Resampling used to grow trees: swor
#>     Resample size used to grow trees: 88
#>                             Analysis: RF-C
#>                               Family: class
#>                       Splitting rule: gini *random*
#>        Number of random split points: 10
#>                     Imbalanced ratio: 1.1719
#>                    (OOB) Brier score: 0.14570089
#>         (OOB) Normalized Brier score: 0.58280356
#>                            (OOB) AUC: 0.90395833
#>                       (OOB) Log-loss: 0.45631611
#>                         (OOB) PR-AUC: 0.88363236
#>                         (OOB) G-mean: 0.74161985
#>    (OOB) Requested performance error: 0.23741007, 0.12, 0.375
#> 
#> Confusion matrix:
#> 
#>           predicted
#>   observed  M  R class.error
#>          M 66  9       0.120
#>          R 24 40       0.375
#> 
#>       (OOB) Misclassification rate: 0.2374101
#> 
#> Random-classifier baselines (uniform):
#>    Brier: 0.25   Normalized Brier: 1   Log-loss: 0.69314718
print(learner$importance())
#>            V9           V12           V10           V49           V11 
#>  0.0543021272  0.0440493882  0.0426266869  0.0405897887  0.0393939980 
#>           V52           V48           V37           V30           V28 
#>  0.0388610088  0.0208475964  0.0190351895  0.0187855962  0.0181797944 
#>           V51           V20           V36           V23           V47 
#>  0.0178610501  0.0174614778  0.0168540534  0.0165995347  0.0159111205 
#>           V40            V5           V22           V17           V21 
#>  0.0143836942  0.0126405945  0.0125613722  0.0117503545  0.0116597562 
#>           V42           V44           V15           V39           V13 
#>  0.0110586560  0.0103113523  0.0100514625  0.0097603131  0.0095613989 
#>           V46            V1            V4           V18           V41 
#>  0.0094205348  0.0091918710  0.0090690016  0.0088283747  0.0087154511 
#>           V45            V8           V38           V31           V54 
#>  0.0084488315  0.0084146889  0.0079892614  0.0079833302  0.0077252953 
#>           V32           V16           V55           V43           V33 
#>  0.0075879389  0.0068236028  0.0065486640  0.0063993684  0.0061108506 
#>           V29           V60           V26           V19            V6 
#>  0.0059533031  0.0058282284  0.0056869609  0.0052452986  0.0052420621 
#>           V58           V25            V7           V34           V56 
#>  0.0047807368  0.0046792511  0.0046553528  0.0042082180  0.0038056571 
#>           V57           V59           V24            V3           V27 
#>  0.0034603827  0.0030605607  0.0029114765  0.0023255149  0.0021860627 
#>           V50           V35           V14            V2           V53 
#>  0.0021581509  0.0017530119  0.0016122668  0.0008739364 -0.0001417546 

# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)

# Score the predictions
predictions$score()
#> classif.ce 
#>   0.115942
