Regression BART (Bayesian Additive Regression Trees) Learner
mlr_learners_regr.bart.Rd
Bayesian Additive Regression Trees are similar to gradient boosting algorithms.
Calls dbarts::bart() from dbarts.
Meta Information
Task type: “regr”
Predict Types: “response”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, dbarts
Parameters
Id | Type | Default | Levels | Range
ntree | integer | 200 | - | \([1, \infty)\)
sigest | untyped | NULL | - | -
sigdf | integer | 3 | - | \([1, \infty)\)
sigquant | numeric | 0.9 | - | \([0, 1]\)
k | numeric | 2 | - | \([0, \infty)\)
power | numeric | 2 | - | \([0, \infty)\)
base | numeric | 0.95 | - | \([0, 1]\)
ndpost | integer | 1000 | - | \([1, \infty)\)
nskip | integer | 100 | - | \([0, \infty)\)
printevery | integer | 100 | - | \([0, \infty)\)
keepevery | integer | 1 | - | \([1, \infty)\)
keeptrainfits | logical | TRUE | TRUE, FALSE | -
usequants | logical | FALSE | TRUE, FALSE | -
numcut | integer | 100 | - | \([1, \infty)\)
printcutoffs | integer | 0 | - | \((-\infty, \infty)\)
verbose | logical | FALSE | TRUE, FALSE | -
nthread | integer | 1 | - | \((-\infty, \infty)\)
keeptrees | logical | FALSE | TRUE, FALSE | -
keepcall | logical | TRUE | TRUE, FALSE | -
sampleronly | logical | FALSE | TRUE, FALSE | -
seed | integer | NA | - | \((-\infty, \infty)\)
proposalprobs | untyped | NULL | - | -
splitprobs | untyped | NULL | - | -
keepsampler | logical | - | TRUE, FALSE | -
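The following is a minimal sketch of setting some of these hyperparameters through the standard mlr3 interface; the specific values are illustrative only, not recommendations.
# Set hyperparameters at construction ...
learner = mlr3::lrn("regr.bart", ntree = 500, k = 3, ndpost = 2000)
# ... or update them later via the parameter set.
learner$param_set$values$nskip = 500
learner$param_set$values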
Custom mlr3 parameters
Parameter: offset
The parameter is removed because only dbarts::bart2 allows an offset during training; the offset parameter in dbarts:::predict.bart is therefore irrelevant for dbarts::dbart.
Parameters: nchain, combineChains, combinechains
These parameters are removed, as parallelization over multiple models is handled by future (see the sketch below).
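A minimal sketch of running several BART fits in parallel via a future backend instead of nchain: once a future plan is set, mlr3 parallelizes the resampling iterations. The plan, worker count, and fold count below are illustrative assumptions.
# Parallelize over resampling iterations with future instead of nchain.
future::plan("multisession", workers = 2)
task = mlr3::tsk("mtcars")
learner = mlr3::lrn("regr.bart", verbose = FALSE)
rr = mlr3::resample(task, learner, mlr3::rsmp("cv", folds = 3))
rr$aggregate(mlr3::msr("regr.rmse"))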
References
Sparapani, Rodney, Spanbauer, Charles, McCulloch, Robert (2021). “Nonparametric machine learning and efficient computation with Bayesian additive regression trees: the BART R package.” Journal of Statistical Software, 97, 1–66.
Chipman, H A, George, E I, McCulloch, R E (2010). “BART: Bayesian additive regression trees.” The Annals of Applied Statistics, 4(1), 266–298.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner
-> mlr3::LearnerRegr
-> LearnerRegrBart
Examples
# Define the Learner
learner = mlr3::lrn("regr.bart")
print(learner)
#> <LearnerRegrBart:regr.bart>: Bayesian Additive Regression Trees
#> * Model: -
#> * Parameters: keeptrees=TRUE
#> * Packages: mlr3, mlr3extralearners, dbarts
#> * Predict Types: [response]
#> * Feature Types: integer, numeric, factor, ordered
#> * Properties: weights
# Define a Task
task = mlr3::tsk("mtcars")
# Create train and test set
ids = mlr3::partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
#>
#> Running BART with numeric y
#>
#> number of trees: 200
#> number of chains: 1, number of threads 1
#> tree thinning rate: 1
#> Prior:
#> k prior fixed to 2.000000
#> degrees of freedom in sigma prior: 3.000000
#> quantile in sigma prior: 0.900000
#> scale in sigma prior: 0.002230
#> power and base for tree prior: 2.000000 0.950000
#> use quantiles for rule cut points: false
#> proposal probabilities: birth/death 0.50, swap 0.10, change 0.40; birth 0.50
#> data:
#> number of training observations: 21
#> number of test observations: 0
#> number of explanatory variables: 10
#> init sigma: 1.829733, curr sigma: 1.829733
#>
#> Cutoff rules c in x<=c vs x>c
#> Number of cutoffs: (var: number of possible c):
#> (1: 100) (2: 100) (3: 100) (4: 100) (5: 100)
#> (6: 100) (7: 100) (8: 100) (9: 100) (10: 100)
#>
#> Running mcmc loop:
#> iteration: 100 (of 1000)
#> iteration: 200 (of 1000)
#> iteration: 300 (of 1000)
#> iteration: 400 (of 1000)
#> iteration: 500 (of 1000)
#> iteration: 600 (of 1000)
#> iteration: 700 (of 1000)
#> iteration: 800 (of 1000)
#> iteration: 900 (of 1000)
#> iteration: 1000 (of 1000)
#> total seconds in loop: 0.266385
#>
#> Tree sizes, last iteration:
#> [1] 2 2 3 3 2 3 2 2 2 2 2 2 3 2 2 2 2 2
#> 2 3 2 2 1 2 2 2 4 3 3 2 2 4 1 2 2 2 2 3
#> 1 4 2 2 2 2 2 2 2 2 3 1 2 1 2 2 3 2 3 2
#> 2 2 2 3 2 2 2 2 2 4 2 2 4 2 1 2 3 2 3 3
#> 2 2 2 2 2 2 2 2 2 4 2 2 2 2 2 2 2 2 2 2
#> 3 2 2 2 3 2 1 2 2 2 2 2 3 2 3 2 2 3 2 3
#> 2 3 2 2 2 2 3 4 2 3 1 3 3 2 2 2 4 2 3 2
#> 2 2 2 2 3 2 2 2 2 2 2 2 2 4 2 2 2 2 3 3
#> 2 3 2 2 2 3 2 3 2 2 2 2 2 2 3 3 2 2 2 2
#> 2 2 3 2 2 2 4 4 4 2 3 4 4 3 3 2 2 3 2 1
#> 2 3
#>
#> Variable Usage, last iteration (var:count):
#> (1: 30) (2: 27) (3: 16) (4: 20) (5: 32)
#> (6: 23) (7: 22) (8: 21) (9: 30) (10: 39)
#>
#> DONE BART
#>
print(learner$model)
#>
#> Call:
#> dbarts::bart(x.train = data, y.train = outcome, keeptrees = TRUE)
#>
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> regr.mse
#> 15.86164
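As a follow-up sketch, the predictions can be scored with other measures, and because this learner is fitted with keeptrees = TRUE the dbarts model object is available via $model. The element names below follow the dbarts::bart() documentation and are assumptions, not guarantees of this page.
# Score with an alternative measure.
predictions$score(mlr3::msr("regr.rmse"))
# Inspect the stored dbarts model: posterior draws of the in-sample fit
# (ndpost draws x training observations) and their posterior mean.
dim(learner$model$yhat.train)
head(learner$model$yhat.train.mean)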