Classification Random Planted Forest Learner
Source: R/learner_randomPlantedForest_classif_rpf.R
mlr_learners_classif.rpf.Rd
Random Planted Forest: A directly interpretable tree ensemble.
Calls randomPlantedForest::rpf() from 'randomPlantedForest'.
Initial parameter values
loss:
Actual default: "L2".
Initial value: "exponential".
Reason for change: Using "L2" (or "L1") loss does not guarantee that predictions are valid probabilities; they are more akin to the linear predictor of a GLM.
Custom mlr3 parameters
max_interaction: This hyperparameter can alternatively be set via max_interaction_ratio as max_interaction = max(ceiling(max_interaction_ratio * n_features), 1). The parameter max_interaction_limit can optionally be set as an upper bound, such that max_interaction_ratio * min(n_features, max_interaction_limit) is used instead. This is analogous to mtry.ratio in classif.ranger, with max_interaction_limit as an additional constraint. The parameter max_interaction_limit is initialized to Inf.
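The arithmetic described above can be sketched in base R. The helper name resolve_max_interaction is hypothetical and is not part of mlr3extralearners or randomPlantedForest. The sketch also assumes that the limit caps the feature count before the ratio is applied and that the ceiling and the lower bound of 1 still apply in that case; the learner's actual implementation may differ.

```r
# Hypothetical helper for illustration only; not an mlr3extralearners function
resolve_max_interaction = function(max_interaction_ratio, n_features,
                                   max_interaction_limit = Inf) {
  stopifnot(
    max_interaction_ratio >= 0, max_interaction_ratio <= 1,
    n_features >= 1, max_interaction_limit >= 1
  )
  # Assumption: the limit caps the number of features before the ratio is applied
  capped_features = min(n_features, max_interaction_limit)
  # Round up and never go below 1, as in the formula for max_interaction
  max(ceiling(max_interaction_ratio * capped_features), 1)
}

resolve_max_interaction(0.5, n_features = 60)                             # 30
resolve_max_interaction(0.5, n_features = 60, max_interaction_limit = 4)  # 2
resolve_max_interaction(0, n_features = 60)                               # 1
```

With the initial value max_interaction_limit = Inf, the cap has no effect and the result depends only on the ratio and the number of features.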
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, randomPlantedForest
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| max_interaction | integer | 1 | | \([0, \infty)\) |
| max_interaction_ratio | numeric | - | | \([0, 1]\) |
| max_interaction_limit | integer | - | | \([1, \infty)\) |
| ntrees | integer | 50 | | \([1, \infty)\) |
| splits | integer | 30 | | \([1, \infty)\) |
| split_try | integer | 10 | | \([1, \infty)\) |
| t_try | numeric | 0.4 | | \([0, 1]\) |
| loss | character | L2 | L1, L2, logit, exponential | - |
| delta | numeric | 0 | | \([0, 1]\) |
| epsilon | numeric | 0.1 | | \([0, 1]\) |
| deterministic | logical | FALSE | TRUE, FALSE | - |
| nthreads | integer | 1 | | \([1, \infty)\) |
| cv | logical | FALSE | TRUE, FALSE | - |
| purify | logical | FALSE | TRUE, FALSE | - |
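The loss parameter is initialized to "exponential" because "L2" and "L1" fits are not guaranteed to produce valid probabilities. The following base R sketch illustrates that point by analogy only: it uses lm() as a stand-in for a squared-error fit on a 0/1 outcome, not a Random Planted Forest, and the toy data are made up for illustration.

```r
# Toy binary outcome that switches from 0 to 1 at x = 0 (illustrative data only)
x = seq(-3, 3, length.out = 50)
y = as.numeric(x > 0)

# Squared-error (L2) fit on the 0/1 outcome; lm() stands in for an L2-loss learner
fit_l2 = lm(y ~ x)
score_l2 = unname(predict(fit_l2))

# The fitted scores are unbounded, so some of them fall outside [0, 1]
range(score_l2)
outside_unit_interval = score_l2 < 0 | score_l2 > 1
sum(outside_unit_interval)
```

Scores like these behave like a linear predictor rather than a probability, which is why a loss designed for classification is a safer starting point when predict_type "prob" is needed.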
Installation
Package 'randomPlantedForest' is not on CRAN and has to be installed from GitHub via
remotes::install_github("PlantedML/randomPlantedForest").
References
Hiabu, Munir, Mammen, Enno, Meyer, T. J (2023). “Random Planted Forest: a directly interpretable tree ensemble.” arXiv preprint arXiv:2012.14563. doi:10.48550/ARXIV.2012.14563.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRandomPlantedForest
Methods
Inherited methods
mlr3::Learner$base_learner()
mlr3::Learner$configure()
mlr3::Learner$encapsulate()
mlr3::Learner$format()
mlr3::Learner$help()
mlr3::Learner$predict()
mlr3::Learner$predict_newdata()
mlr3::Learner$print()
mlr3::Learner$reset()
mlr3::Learner$selected_features()
mlr3::Learner$train()
mlr3::LearnerClassif$predict_newdata_fast()
Examples
# Define the Learner
learner = lrn("classif.rpf")
print(learner)
#>
#> ── <LearnerClassifRandomPlantedForest> (classif.rpf): Random Planted Forest ────
#> • Model: -
#> • Parameters: max_interaction_limit=Inf, loss=exponential
#> • Packages: mlr3 and randomPlantedForest
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: multiclass and twoclass
#> • Other settings: use_weights = 'error'
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
#> -- Classification Random Planted Forest --
#>
#> Formula: NULL
#> Fit using 60 predictors and main effects only.
#> Forest is _not_ purified!
#>
#> Called with parameters:
#>
#> loss: exponential
#> ntrees: 50
#> max_interaction: 1
#> splits: 30
#> split_try: 10
#> t_try: 0.4
#> delta: 0
#> epsilon: 0.1
#> deterministic: FALSE
#> nthreads: 1
#> purify: FALSE
#> cv: FALSE
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.2753623