BlockForest Classification Learner
Source: R/learner_blockForest_classif_blockforest.R
Random forests for blocks of clinical and omics covariate data.
Calls blockForest::blockfor() from package blockForest.
In this learner, only the trained forest object ($forest) is retained. The optimized block-specific tuning parameters (paramvalues) and the biased OOB error estimate (biased_oob_error_donotuse) are discarded, as they are either not needed for downstream use or not reliable for performance estimation.
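Since the OOB error estimate is not kept, performance is normally estimated by resampling. The following is a minimal sketch, assuming mlr3, mlr3extralearners and blockForest are installed. The block split and the small tree counts are borrowed from the Examples section only to keep the run short, and the 3-fold cross-validation is an arbitrary choice.

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")

# Small settings chosen for speed, not for accuracy
learner = lrn("classif.blockforest",
  blocks = list(bl1 = 1:42, bl2 = 43:60),
  nsets = 10, num.trees = 50, num.trees.pre = 10, splitrule = "gini")

# 3-fold cross-validation (fold count is an assumption of this sketch)
rr = resample(task, learner, rsmp("cv", folds = 3))

# Aggregated classification error over the folds
score = rr$aggregate(msr("classif.ce"))
print(score)
```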
Initial parameter values
num.threads is initialized to 1 to avoid conflicts with parallelization via future.
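If parallelization via future is not in use, the thread count can be raised on the learner's parameter set. This is a minimal sketch, assuming mlr3 and mlr3extralearners are installed; the value 2 is arbitrary.

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.blockforest")

# Value set at construction (1, according to the note above)
print(learner$param_set$values$num.threads)

# Raise the number of threads; 2 is an arbitrary illustrative value
learner$param_set$set_values(num.threads = 2)
print(learner$param_set$values$num.threads)
```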
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, blockForest
Parameters
Id | Type | Default | Levels | Range |
blocks | untyped | - | | - |
block.method | character | BlockForest | BlockForest, RandomBlock, BlockVarSel, VarProb, SplitWeights | - |
num.trees | integer | 2000 | | \([1, \infty)\) |
mtry | untyped | NULL | | - |
nsets | integer | 300 | | \([1, \infty)\) |
num.trees.pre | integer | 1500 | | \([1, \infty)\) |
splitrule | character | extratrees | extratrees, gini | - |
always.select.block | integer | 0 | | \([0, 1]\) |
importance | character | - | none, impurity, impurity_corrected, permutation | - |
num.threads | integer | - | | \([1, \infty)\) |
seed | integer | NULL | | \((-\infty, \infty)\) |
verbose | logical | TRUE | TRUE, FALSE | - |
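The blocks parameter takes a list of integer index vectors, one per block. The sketch below builds such a list from feature names in base R. It assumes the indices refer to positions in task$feature_names, as the Examples section suggests. The feature names and the "gene_" prefix are hypothetical.

```r
# Hypothetical feature names; in practice these would come from task$feature_names
feature_names = c("age", "sex", "gene_1", "gene_2", "gene_3")

# One integer index vector per block, selected by a naming convention
is_omics = grepl("^gene_", feature_names)
blocks = list(
  clinical = which(!is_omics),
  omics    = which(is_omics)
)

# Sanity check for this sketch: each feature sits in exactly one block
all_idx = unlist(blocks, use.names = FALSE)
stopifnot(!anyDuplicated(all_idx), setequal(all_idx, seq_along(feature_names)))

print(blocks)
```

The resulting list can then be passed as blocks = blocks when constructing the learner.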
References
Hornung, R., Wright, M. N. (2019). “Block Forests: Random forests for blocks of clinical and omics covariate data.” BMC Bioinformatics, 20(1), 1–17. doi:10.1186/s12859-019-2942-y, https://doi.org/10.1186/s12859-019-2942-y.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifBlockForest
Methods
Inherited methods
mlr3::Learner$base_learner()
mlr3::Learner$configure()
mlr3::Learner$encapsulate()
mlr3::Learner$format()
mlr3::Learner$help()
mlr3::Learner$predict()
mlr3::Learner$predict_newdata()
mlr3::Learner$print()
mlr3::Learner$reset()
mlr3::Learner$selected_features()
mlr3::Learner$train()
mlr3::LearnerClassif$predict_newdata_fast()
Method importance()
The importance scores are extracted from the model slot variable.importance.
Returns
Named numeric().
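Because the return value is a plain named numeric vector, it can be processed with base R. The sketch below uses a small made-up vector as a stand-in for the output of learner$importance(); the names and values are illustrative only.

```r
# Toy stand-in for the named numeric vector returned by learner$importance()
imp = c(V42 = 0, V11 = 0.020, V55 = -0.003, V10 = 0.014)

# Names of the two highest-scoring features
top2 = names(head(sort(imp, decreasing = TRUE), 2))

# Names of all features with strictly positive importance, in sorted order
positive = names(sort(imp[imp > 0], decreasing = TRUE))

print(top2)
print(positive)
```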
Examples
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# check task's features
task$feature_names
#> [1] "V1" "V10" "V11" "V12" "V13" "V14" "V15" "V16" "V17" "V18" "V19" "V2"
#> [13] "V20" "V21" "V22" "V23" "V24" "V25" "V26" "V27" "V28" "V29" "V3" "V30"
#> [25] "V31" "V32" "V33" "V34" "V35" "V36" "V37" "V38" "V39" "V4" "V40" "V41"
#> [37] "V42" "V43" "V44" "V45" "V46" "V47" "V48" "V49" "V5" "V50" "V51" "V52"
#> [49] "V53" "V54" "V55" "V56" "V57" "V58" "V59" "V6" "V60" "V7" "V8" "V9"
# partition features to 2 blocks
blocks = list(bl1 = 1:42, bl2 = 43:60)
# define learner
learner = lrn("classif.blockforest", blocks = blocks,
  importance = "permutation", nsets = 10, predict_type = "prob",
  num.trees = 50, num.trees.pre = 10, splitrule = "gini")
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# feature importance
learner$importance()
#> V11 V10 V12 V9 V48
#> 2.007128e-02 1.411729e-02 1.206166e-02 1.123112e-02 9.725881e-03
#> V4 V47 V45 V16 V35
#> 7.358149e-03 6.520847e-03 5.849192e-03 5.710182e-03 5.122188e-03
#> V51 V21 V15 V6 V8
#> 4.728236e-03 4.601349e-03 4.303741e-03 4.233767e-03 4.131522e-03
#> V20 V28 V29 V37 V22
#> 3.802191e-03 3.737910e-03 3.582095e-03 3.384041e-03 3.012218e-03
#> V36 V49 V31 V5 V52
#> 2.919827e-03 2.718692e-03 2.438072e-03 2.279069e-03 2.267894e-03
#> V27 V25 V43 V46 V13
#> 2.214828e-03 2.190153e-03 2.113948e-03 1.999545e-03 1.677667e-03
#> V33 V7 V40 V44 V58
#> 1.676083e-03 1.502669e-03 1.376004e-03 1.312580e-03 1.305250e-03
#> V23 V24 V53 V3 V60
#> 1.000658e-03 9.931478e-04 9.722796e-04 7.437815e-04 6.611196e-04
#> V14 V41 V18 V59 V19
#> 5.729742e-04 4.187138e-04 4.047785e-04 3.953082e-04 3.078718e-04
#> V1 V2 V42 V17 V32
#> 1.897631e-04 4.912238e-06 0.000000e+00 -9.982269e-06 -6.056371e-05
#> V30 V54 V50 V34 V26
#> -8.558355e-05 -2.210958e-04 -3.137006e-04 -4.598775e-04 -7.185999e-04
#> V38 V57 V56 V39 V55
#> -9.445704e-04 -1.531917e-03 -2.798605e-03 -3.032471e-03 -3.487564e-03
# Make predictions for the test observations
pred = learner$predict(task, row_ids = ids$test)
pred
#>
#> ── <PredictionClassif> for 69 observations: ────────────────────────────────────
#> row_ids truth response prob.M prob.R
#> 4 R M 0.5686508 0.4313492
#> 10 R R 0.2389683 0.7610317
#> 13 R R 0.3135000 0.6865000
#> --- --- --- --- ---
#> 198 M M 0.8247540 0.1752460
#> 200 M M 0.7944444 0.2055556
#> 206 M M 0.7860000 0.2140000
# Score the predictions
pred$score()
#> classif.ce
#> 0.2028986