Oblique Random Forest Classifier
Source: R/learner_aorsf_classif_aorsf.R
mlr_learners_classif.aorsf.Rd
Accelerated oblique random classification forest.
Calls aorsf::orsf() from aorsf.
Note that although the learner has the property "missings" and can in
principle handle missing values, this behavior has to be configured using
the parameter na_action.
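As a minimal sketch of the configuration described above, the learner can be constructed with na_action set to "impute_meanmode", one of the two levels listed in the parameter table. The snippet assumes that mlr3 and mlr3extralearners are installed and attached; it only inspects the stored value and does not train a model.

```r
library(mlr3)
library(mlr3extralearners)

# Construct the learner with mean/mode imputation instead of the default "fail"
learner = lrn("classif.aorsf", na_action = "impute_meanmode")

# Inspect the value stored in the parameter set
learner$param_set$values$na_action
```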
Initial parameter values
n_thread: This parameter is initialized to 1 (default is 0) to avoid conflicts with the mlr3 parallelization.
pred_simplify has to be TRUE, otherwise response is NA in prediction.
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, aorsf
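The two predict types can be switched at construction time. The following is a sketch, assuming the packages above are installed; it uses the "sonar" task from the Examples section and the measure classif.auc from mlr3, and reduces n_tree only to keep the run short.

```r
library(mlr3)
library(mlr3extralearners)

# Request class probabilities instead of the default "response"
learner = lrn("classif.aorsf", predict_type = "prob", n_tree = 100L)

task = tsk("sonar")
ids = partition(task)

learner$train(task, row_ids = ids$train)
pred = learner$predict(task, row_ids = ids$test)

# One probability column per class
head(pred$prob)

# Probability predictions allow threshold-free measures such as AUC
pred$score(msr("classif.auc"))
```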
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| attach_data | logical | TRUE | TRUE, FALSE | - |
| epsilon | numeric | 1e-09 | | \([0, \infty)\) |
| importance | character | anova | none, anova, negate, permute | - |
| importance_max_pvalue | numeric | 0.01 | | \([1e-04, 0.9999]\) |
| leaf_min_events | integer | 1 | | \([1, \infty)\) |
| leaf_min_obs | integer | 5 | | \([1, \infty)\) |
| max_iter | integer | 20 | | \([1, \infty)\) |
| method | character | glm | glm, net, pca, random | - |
| mtry | integer | NULL | | \([1, \infty)\) |
| mtry_ratio | numeric | - | | \([0, 1]\) |
| n_retry | integer | 3 | | \([0, \infty)\) |
| n_split | integer | 5 | | \([1, \infty)\) |
| n_thread | integer | - | | \([0, \infty)\) |
| n_tree | integer | 500 | | \([1, \infty)\) |
| na_action | character | fail | fail, impute_meanmode | - |
| net_mix | numeric | 0.5 | | \((-\infty, \infty)\) |
| oobag | logical | FALSE | TRUE, FALSE | - |
| oobag_eval_every | integer | NULL | | \([1, \infty)\) |
| oobag_fun | untyped | NULL | | - |
| oobag_pred_type | character | prob | none, leaf, prob, class | - |
| pred_aggregate | logical | TRUE | TRUE, FALSE | - |
| sample_fraction | numeric | 0.632 | | \([0, 1]\) |
| sample_with_replacement | logical | TRUE | TRUE, FALSE | - |
| scale_x | logical | FALSE | TRUE, FALSE | - |
| split_min_events | integer | 5 | | \([1, \infty)\) |
| split_min_obs | integer | 10 | | \([1, \infty)\) |
| split_min_stat | numeric | NULL | | \([0, \infty)\) |
| split_rule | character | gini | gini, cstat | - |
| target_df | integer | NULL | | \([1, \infty)\) |
| tree_seeds | integer | NULL | | \([1, \infty)\) |
| verbose_progress | logical | FALSE | TRUE, FALSE | - |
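The parameters listed above can be set when the learner is constructed. This is a small sketch, assuming mlr3 and mlr3extralearners are attached; the chosen values are arbitrary illustrations within the documented ranges, not recommendations.

```r
library(mlr3)
library(mlr3extralearners)

# Set a few hyperparameters at construction time
learner = lrn("classif.aorsf",
  n_tree = 100L,          # fewer trees than the default of 500
  mtry_ratio = 0.5,       # fraction of features considered, in [0, 1]
  importance = "permute"  # one of: none, anova, negate, permute
)

# Show all explicitly set values (n_thread = 1 is set by the learner itself)
learner$param_set$values
```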
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifObliqueRandomForest
Methods
Inherited methods
mlr3::Learner$base_learner()
mlr3::Learner$configure()
mlr3::Learner$encapsulate()
mlr3::Learner$format()
mlr3::Learner$help()
mlr3::Learner$predict()
mlr3::Learner$predict_newdata()
mlr3::Learner$print()
mlr3::Learner$reset()
mlr3::Learner$selected_features()
mlr3::Learner$train()
mlr3::LearnerClassif$predict_newdata_fast()
LearnerClassifObliqueRandomForest$oob_error()
OOB concordance error extracted from the model slot eval_oobag$stat_values.
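A short usage sketch, assuming the packages are installed and the learner is trained with its default out-of-bag settings so that the model carries an OOB statistic. The "sonar" task is taken from the Examples section, and n_tree is reduced only to keep the run short.

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")
learner = lrn("classif.aorsf", n_tree = 100L)

# The OOB statistic is computed during training
learner$train(task)

# Retrieve the out-of-bag error of the fitted forest
learner$oob_error()
```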
LearnerClassifObliqueRandomForest$importance()
The importance scores are extracted from the model.
Returns
Named numeric().
Examples
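# Load the packages that provide lrn(), tsk(), and partition()
library(mlr3)
library(mlr3extralearners)
# Define the Learner
learner = lrn("classif.aorsf")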
print(learner)
#>
#> ── <LearnerClassifObliqueRandomForest> (classif.aorsf): Oblique Random Forest Cl
#> • Model: -
#> • Parameters: n_thread=1
#> • Packages: mlr3, mlr3extralearners, and aorsf
#> • Predict Types: [response] and prob
#> • Feature Types: integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, missings, multiclass, oob_error, twoclass, and
#> weights
#> • Other settings: use_weights = 'use', predict_raw = 'FALSE'
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
#> ---------- Oblique random classification forest
#>
#> Linear combinations: Logistic regression
#> N observations: 139
#> N classes: 2
#> N trees: 500
#> N predictors total: 60
#> N predictors per node: 8
#> Average leaves per tree: 4.908
#> Min observations in leaf: 5
#> OOB stat value: 0.92
#> OOB stat type: AUC-ROC
#> Variable importance: anova
#>
#> -----------------------------------------
print(learner$importance())
#> V49 V51 V11 V47 V36 V12 V48
#> 0.42639594 0.37755102 0.37387387 0.37254902 0.36040609 0.35467980 0.34188034
#> V37 V9 V10 V35 V5 V43 V1
#> 0.33936652 0.30687831 0.30612245 0.26530612 0.25000000 0.24761905 0.22807018
#> V46 V22 V20 V52 V21 V45 V50
#> 0.22167488 0.20942408 0.19791667 0.19138756 0.18918919 0.18905473 0.17837838
#> V13 V44 V19 V23 V31 V4 V16
#> 0.17766497 0.17171717 0.13761468 0.12568306 0.12217195 0.12019231 0.10628019
#> V58 V15 V17 V18 V42 V57 V33
#> 0.10138249 0.10087719 0.09478673 0.09333333 0.09322034 0.08695652 0.08333333
#> V28 V8 V34 V59 V41 V2 V29
#> 0.08296943 0.07804878 0.07537688 0.07462687 0.07142857 0.07103825 0.07000000
#> V54 V24 V14 V39 V32 V30 V26
#> 0.06912442 0.06603774 0.06593407 0.06493506 0.05555556 0.05527638 0.05045872
#> V27 V40 V7 V3 V60 V53 V6
#> 0.04761905 0.04663212 0.04522613 0.04435484 0.04188482 0.03755869 0.03296703
#> V38 V56 V55 V25
#> 0.02857143 0.02690583 0.02272727 0.02222222
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.2753623