Survival Gradient Boosting Machine Learner
mlr_learners_surv.gbm.Rd
Gradient Boosting for Survival Analysis.
Calls gbm::gbm() from gbm.
Prediction types
This learner returns two prediction types, using the internal predict.gbm() function:
lp: a vector containing the linear predictors (relative risk scores), where each score corresponds to a specific test observation.
crank: same as lp.
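The relationship between the two prediction types can be checked directly. The following is a minimal sketch, assuming mlr3, mlr3proba, mlr3extralearners and gbm are installed; the seed and the variable names are illustrative choices, not part of the learner's API.

```r
library(mlr3)
library(mlr3proba)
library(mlr3extralearners)

# Illustrative setup: the "grace" task with a reproducible split
task = tsk("grace")
set.seed(1)
ids = partition(task)

learner = lrn("surv.gbm")
learner$train(task, row_ids = ids$train)
pred = learner$predict(task, row_ids = ids$test)

# One linear predictor per test observation
n_match = length(pred$lp) == length(ids$test)

# crank is documented as "same as lp"; compare the values only
same = isTRUE(all.equal(as.numeric(pred$crank), as.numeric(pred$lp)))
print(c(n_match = n_match, same = same))
```

If crank mirrors lp as described above, both checks should report TRUE.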
Meta Information
Task type: “surv”
Predict Types: “crank”, “lp”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3proba, mlr3extralearners, gbm
Parameters
Id | Type | Default | Levels | Range |
distribution | character | coxph | coxph | - |
n.trees | integer | 100 | | \([1, \infty)\) |
cv.folds | integer | 0 | | \([0, \infty)\) |
interaction.depth | integer | 1 | | \([1, \infty)\) |
n.minobsinnode | integer | 10 | | \([1, \infty)\) |
shrinkage | numeric | 0.001 | | \([0, \infty)\) |
bag.fraction | numeric | 0.5 | | \([0, 1]\) |
train.fraction | numeric | 1 | | \([0, 1]\) |
keep.data | logical | FALSE | TRUE, FALSE | - |
verbose | logical | FALSE | TRUE, FALSE | - |
var.monotone | untyped | - | | - |
n.cores | integer | 1 | | \((-\infty, \infty)\) |
single.tree | logical | FALSE | TRUE, FALSE | - |
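Hyperparameters from the table can be passed when the learner is constructed. The sketch below assumes the same packages as the Examples section; the specific values are arbitrary illustrations that fall within the listed ranges, not tuning recommendations.

```r
library(mlr3)
library(mlr3proba)
library(mlr3extralearners)

# Illustrative values only; each lies within the range given in the table
learner = lrn("surv.gbm",
  n.trees = 500,
  interaction.depth = 2,
  shrinkage = 0.01,
  bag.fraction = 0.8
)

# Shows the values set above together with the learner's initial values
print(learner$param_set$values)
```

Values outside the documented ranges (for example a negative shrinkage) are expected to be rejected by the parameter set when they are assigned.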
Initial parameter values
distribution:
Actual default: "bernoulli"
Adjusted default: "coxph"
Reason for change: This is the only distribution available for survival.
keep.data:
Actual default: TRUE
Adjusted default: FALSE
Reason for change: keep.data = FALSE saves memory during model fitting.
n.cores:
Actual default: NULL
Adjusted default: 1
Reason for change: Suppresses the automatic internal parallelization when cv.folds > 0 and avoids threading conflicts with future.
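The adjusted defaults are ordinary parameter values and can be inspected or overridden. A minimal sketch, assuming the same packages as the Examples section; setting keep.data back to TRUE is shown only as an illustration of overriding one of them.

```r
library(mlr3)
library(mlr3proba)
library(mlr3extralearners)

learner = lrn("surv.gbm")

# Initial values set by the learner: distribution, keep.data, n.cores
str(learner$param_set$values)

# Override one adjusted default, e.g. to keep the training data in the model
learner$param_set$values$keep.data = TRUE
print(learner$param_set$values$keep.data)
```

Overriding keep.data this way trades the memory saving described above for having the data stored in the fitted gbm object.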
References
Friedman, J H (2002). “Stochastic gradient boosting.” Computational Statistics & Data Analysis, 38(4), 367–378.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3proba::LearnerSurv -> LearnerSurvGBM
Methods
Method importance()
The importance scores are extracted from the model slot variable.importance.
Returns
Named numeric().
Examples
# Define the Learner
learner = mlr3::lrn("surv.gbm")
print(learner)
#> <LearnerSurvGBM:surv.gbm>: Gradient Boosting
#> * Model: -
#> * Parameters: distribution=coxph, keep.data=FALSE, n.cores=1
#> * Packages: mlr3, mlr3proba, mlr3extralearners, gbm
#> * Predict Types: [crank], lp
#> * Feature Types: integer, numeric, factor, ordered
#> * Properties: importance, missings, weights
# Define a Task
task = mlr3::tsk("grace")
# Create train and test set
ids = mlr3::partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
#> gbm::gbm(formula = f, distribution = "coxph", data = task$data(),
#> weights = NULL, keep.data = FALSE, n.cores = 1L)
#> A gradient boosted model with coxph loss function.
#> 100 iterations were performed.
#> There were 6 predictors of which 6 had non-zero influence.
print(learner$importance())
#> revascdays revasc age sysbp los stchange
#> 45.851195 24.163989 14.302501 7.274079 5.920204 2.488032
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
#> Using 100 trees...
# Score the predictions
predictions$score()
#> surv.cindex
#> 0.8628286