Survival Random Forest SRC Learner

Prediction types

This learner returns two prediction types:

distr: a survival matrix in two dimensions, where observations are represented in rows and (unique event) time points in columns. Calculated using the internal randomForestSRC::predict.rfsrc() function.
crank: the expected mortality using mlr3proba::.surv_return().
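
To make the relationship between the two prediction types concrete, the following base R sketch derives an expected-mortality style ranking from a toy survival matrix by summing the cumulative hazard $H(t) = -\log S(t)$ over the time points. The matrix values and object names are made up for illustration, and the internal helper in mlr3proba may treat edge cases (for example $S(t) = 0$) differently.

```r
# Toy survival matrix: 2 observations (rows) x 3 time points (columns).
# The values are illustrative only, not output of the learner.
surv = matrix(
  c(0.9, 0.7, 0.5,
    0.8, 0.4, 0.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("obs1", "obs2"), c("1", "2", "3"))
)

# Cumulative hazard at each time point: H(t) = -log(S(t))
cumhaz = -log(surv)

# Expected mortality: sum of the cumulative hazard over all time points.
# A larger value indicates higher risk.
crank = rowSums(cumhaz)
crank
```

Here obs2 has lower survival probabilities at every time point and therefore receives the larger crank value.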

Dictionary

This Learner can be instantiated via lrn():

lrn("surv.rfsrc")

Meta Information

Task type: “surv”
Predict Types: “crank”, “distr”
Feature Types: “logical”, “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3proba, mlr3extralearners, randomForestSRC, pracma

Parameters

Id	Type	Default	Levels	Range
ntree	integer	500		$[1, \infty)$
mtry	integer	-		$[1, \infty)$
mtry.ratio	numeric	-		$[0, 1]$
nodesize	integer	15		$[1, \infty)$
nodedepth	integer	-		$[1, \infty)$
splitrule	character	logrank	logrank, bs.gradient, logrankscore	-
nsplit	integer	10		$[0, \infty)$
importance	character	FALSE	FALSE, TRUE, none, permute, random, anti	-
block.size	integer	10		$[1, \infty)$
bootstrap	character	by.root	by.root, by.node, none, by.user	-
samptype	character	swor	swor, swr	-
samp	untyped	-		-
membership	logical	FALSE	TRUE, FALSE	-
sampsize	untyped	-		-
sampsize.ratio	numeric	-		$[0, 1]$
na.action	character	na.omit	na.omit, na.impute	-
nimpute	integer	1		$[1, \infty)$
ntime	integer	150		$[0, \infty)$
cause	integer	-		$[1, \infty)$
proximity	character	FALSE	FALSE, TRUE, inbag, oob, all	-
distance	character	FALSE	FALSE, TRUE, inbag, oob, all	-
forest.wt	character	FALSE	FALSE, TRUE, inbag, oob, all	-
xvar.wt	untyped	-		-
split.wt	untyped	-		-
forest	logical	TRUE	TRUE, FALSE	-
var.used	character	FALSE	FALSE, all.trees, by.tree	-
split.depth	character	FALSE	FALSE, all.trees, by.tree	-
seed	integer	-		$(-\infty, -1]$
do.trace	logical	FALSE	TRUE, FALSE	-
statistics	logical	FALSE	TRUE, FALSE	-
get.tree	untyped	-		-
outcome	character	train	train, test	-
ptn.count	integer	0		$[0, \infty)$
estimator	character	nelson	nelson, kaplan	-
cores	integer	1		$[1, \infty)$
save.memory	logical	FALSE	TRUE, FALSE	-
perf.type	character	-	none	-
case.depth	logical	FALSE	TRUE, FALSE	-
marginal.xvar	untyped	NULL		-

Custom mlr3 parameters

estimator: Hidden parameter that controls the type of estimator used to derive the survival function during prediction. The default value is "nelson", which uses a bootstrapped Nelson-Aalen estimator for the cumulative hazard function $H(t)$ (Ishwaran et al., 2008), from which we calculate $S(t) = \exp(-H(t))$, whereas "kaplan" uses a bootstrapped Kaplan-Meier estimator to directly estimate $S(t)$.
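
The two routes to $S(t)$ described above can be compared on a small sample. The sketch below uses base R only and a made-up data set of six observations; it computes the plain (non-bootstrapped) Nelson-Aalen and Kaplan-Meier estimates, so it illustrates the formulas rather than what randomForestSRC does inside each terminal node.

```r
# Toy right-censored sample (illustrative values): status 1 = event, 0 = censored
time   = c(2, 3, 3, 5, 7, 8)
status = c(1, 1, 0, 1, 0, 1)

# Unique event times, number at risk and number of events at each of them
event_times = sort(unique(time[status == 1]))
n_risk  = vapply(event_times, function(t) sum(time >= t), numeric(1))
n_event = vapply(event_times, function(t) sum(time == t & status == 1), numeric(1))

# Nelson-Aalen cumulative hazard, then S(t) = exp(-H(t))
chf = cumsum(n_event / n_risk)
surv_nelson = exp(-chf)

# Kaplan-Meier product-limit estimate of S(t)
surv_kaplan = cumprod(1 - n_event / n_risk)

data.frame(time = event_times, surv_nelson, surv_kaplan)
```

Because $\exp(-x) \ge 1 - x$, the Nelson-Aalen based curve is never below the Kaplan-Meier curve; the difference is most visible at late time points where few observations remain at risk.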

mtry: This hyperparameter can alternatively be set via the added hyperparameter mtry.ratio as mtry = max(ceiling(mtry.ratio * n_features), 1). Note that mtry and mtry.ratio are mutually exclusive.
sampsize: This hyperparameter can alternatively be set via the added hyperparameter sampsize.ratio as sampsize = max(ceiling(sampsize.ratio * n_obs), 1). Note that sampsize and sampsize.ratio are mutually exclusive.
cores: This value is set as the option rf.cores during training and is set to 1 by default.
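
The ratio formulas for mtry and sampsize share the same shape, so they can be checked with one small helper. `ratio_to_count()` is a hypothetical name used only for this sketch and is not part of mlr3extralearners.

```r
# Hypothetical helper mirroring max(ceiling(ratio * n), 1) from the text above
ratio_to_count = function(ratio, n) {
  stopifnot(ratio >= 0, ratio <= 1, n >= 1)
  max(ceiling(ratio * n), 1)
}

# mtry.ratio = 0.5 with 6 features
mtry = ratio_to_count(0.5, 6)          # 3

# sampsize.ratio = 0.25 with 670 training observations
sampsize = ratio_to_count(0.25, 670)   # 168

# A ratio of 0 still yields at least one feature
mtry_min = ratio_to_count(0, 6)        # 1
```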

Initial parameter values

ntime: Number of time points to which the observed event times are coerced for use in the estimated survival function during prediction. We changed the default value from 150 to 0 to be in line with other random survival forest learners and to use all unique event times from the training set.
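
As a minimal sketch, assuming mlr3, mlr3proba and mlr3extralearners are installed, the initial value can be inspected and reset to the upstream default of 150 through the learner's parameter set:

```r
library(mlr3)
library(mlr3proba)
library(mlr3extralearners)

learner = lrn("surv.rfsrc")

# Initial value set by this learner (use all unique event times)
initial_ntime = learner$param_set$values$ntime

# Switch back to the randomForestSRC default grid of 150 time points
learner$param_set$set_values(ntime = 150)
learner$param_set$values$ntime
```

A coarser grid reduces the number of columns in the predicted survival matrix, which may be useful for large training sets with many distinct event times.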

References

Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS (2008). “Random survival forests.” The Annals of Applied Statistics, 2(3). doi:10.1214/08-aoas169, https://doi.org/10.1214/08-aoas169.

Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.

Author

RaphaelS1

Super classes

mlr3::Learner -> mlr3proba::LearnerSurv -> LearnerSurvRandomForestSRC

Methods

Public methods

LearnerSurvRandomForestSRC$new()
LearnerSurvRandomForestSRC$importance()
LearnerSurvRandomForestSRC$selected_features()
LearnerSurvRandomForestSRC$oob_error()
LearnerSurvRandomForestSRC$clone()

Method `new()`

Creates a new instance of this R6 class.

Usage

LearnerSurvRandomForestSRC$new()

Method `importance()`

The importance scores are extracted from the model slot importance.

Usage

LearnerSurvRandomForestSRC$importance()

Returns

Named numeric().

Method `selected_features()`

Selected features are extracted from the model slot var.used.

Usage

LearnerSurvRandomForestSRC$selected_features()

Returns

character().

Method `oob_error()`

OOB error extracted from the model slot err.rate.

Usage

LearnerSurvRandomForestSRC$oob_error()

Returns

numeric().

Method `clone()`

The objects of this class are cloneable with this method.

Usage

LearnerSurvRandomForestSRC$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# Define the Learner
learner = mlr3::lrn("surv.rfsrc", importance = "TRUE")
print(learner)
#> 
#> ── <LearnerSurvRandomForestSRC> (surv.rfsrc): Random Forest ────────────────────
#> • Model: -
#> • Parameters: importance=TRUE, ntime=0
#> • Packages: mlr3, mlr3proba, mlr3extralearners, randomForestSRC, and pracma
#> • Predict Types: [crank] and distr
#> • Feature Types: logical, integer, numeric, and factor
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, missings, oob_error, and weights
#> • Other settings: use_weights = 'use'

# Define a Task
task = mlr3::tsk("grace")

# Create train and test set
ids = mlr3::partition(task)

# Train the learner on the training ids
learner$train(task, row_ids = ids$train)

print(learner$model)
#>                          Sample size: 670
#>                     Number of deaths: 224
#>                      Number of trees: 500
#>            Forest terminal node size: 15
#>        Average no. of terminal nodes: 27.23
#> No. of variables tried at each split: 3
#>               Total no. of variables: 6
#>        Resampling used to grow trees: swor
#>     Resample size used to grow trees: 423
#>                             Analysis: RSF
#>                               Family: surv
#>                       Splitting rule: logrank *random*
#>        Number of random split points: 10
#>                           (OOB) CRPS: 16.78840411
#>                    (OOB) stand. CRPS: 0.09484974
#>    (OOB) Requested performance error: 0.16341483
#> 
print(learner$importance())
#>  revascdays      revasc         age         los       sysbp    stchange 
#> 0.411877824 0.265766272 0.140804526 0.101042397 0.050669452 0.006374135 

# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)

# Score the predictions
predictions$score()
#> surv.cindex 
#>   0.8328082
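
The remaining accessor methods documented above, oob_error() and selected_features(), can be tried in a similar way. This is a sketch under two assumptions: the three packages below are installed, and selected_features() needs the var.used parameter to be set (here to "all.trees", one of the levels listed in the parameter table) so that the model slot var.used is populated. ntree = 50 is chosen only to keep the example fast.

```r
library(mlr3)
library(mlr3proba)
library(mlr3extralearners)

task = tsk("grace")

# Record variable usage across all trees so selected_features() has data
learner = lrn("surv.rfsrc", ntree = 50, var.used = "all.trees")
learner$train(task)

# Out-of-bag error taken from the model slot err.rate
oob = learner$oob_error()

# Names of the features recorded in the model slot var.used
feats = learner$selected_features()

print(oob)
print(feats)
```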
