eXtreme Gradient Boosting regression. Calls xgboost::xgb.train() from package xgboost.

Note: We strongly advise using the separate Cox and AFT xgboost survival learners instead, since they represent two very distinct survival modeling methods and offer more prediction types than this learner does. This learner will be deprecated in the future.

Note

To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
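
With such a build in place, GPU training can be requested through the tree_method parameter (see the parameter table below). A minimal sketch, assuming a GPU-enabled xgboost installation:

# a minimal sketch, assuming xgboost was compiled with CUDA support
learner = mlr3::lrn("surv.xgboost", tree_method = "gpu_hist")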

Initial parameter values

  • nrounds is initialized to 1.

  • nthread is initialized to 1 to avoid conflicts with parallelization via future.

  • verbose is initialized to 0.

  • objective is initialized to survival:cox for survival analysis.
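
These initial values can be overridden at construction time. As a minimal sketch, switching to the AFT objective could look like this (all parameter names used here appear in the table below):

# a minimal sketch: override the initialized objective with the AFT variant
learner = mlr3::lrn("surv.xgboost",
  objective = "survival:aft",
  aft_loss_distribution = "normal",
  aft_loss_distribution_scale = 1,
  nrounds = 50
)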

Dictionary

This Learner can be instantiated via the dictionary mlr_learners or with the associated sugar function lrn():

mlr_learners$get("surv.xgboost")
lrn("surv.xgboost")

Meta Information

  • Task type: "surv"

  • Predict Types: "crank", "lp"

  • Feature Types: "integer", "numeric"

  • Required Packages: mlr3, mlr3proba, mlr3extralearners, xgboost

Parameters

| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| aft_loss_distribution | character | normal | normal, logistic, extreme | - |
| aft_loss_distribution_scale | numeric | - | - | \((-\infty, \infty)\) |
| alpha | numeric | 0 | - | \([0, \infty)\) |
| base_score | numeric | 0.5 | - | \((-\infty, \infty)\) |
| booster | character | gbtree | gbtree, gblinear, dart | - |
| callbacks | untyped | list() | - | - |
| colsample_bylevel | numeric | 1 | - | \([0, 1]\) |
| colsample_bynode | numeric | 1 | - | \([0, 1]\) |
| colsample_bytree | numeric | 1 | - | \([0, 1]\) |
| disable_default_eval_metric | logical | FALSE | TRUE, FALSE | - |
| early_stopping_rounds | integer | NULL | - | \([1, \infty)\) |
| early_stopping_set | character | none | none, train, test | - |
| eta | numeric | 0.3 | - | \([0, 1]\) |
| feature_selector | character | cyclic | cyclic, shuffle, random, greedy, thrifty | - |
| feval | untyped | NULL | - | - |
| gamma | numeric | 0 | - | \([0, \infty)\) |
| grow_policy | character | depthwise | depthwise, lossguide | - |
| interaction_constraints | untyped | - | - | - |
| iterationrange | untyped | - | - | - |
| lambda | numeric | 1 | - | \([0, \infty)\) |
| lambda_bias | numeric | 0 | - | \([0, \infty)\) |
| max_bin | integer | 256 | - | \([2, \infty)\) |
| max_delta_step | numeric | 0 | - | \([0, \infty)\) |
| max_depth | integer | 6 | - | \([0, \infty)\) |
| max_leaves | integer | 0 | - | \([0, \infty)\) |
| maximize | logical | NULL | TRUE, FALSE | - |
| min_child_weight | numeric | 1 | - | \([0, \infty)\) |
| missing | numeric | NA | - | \((-\infty, \infty)\) |
| monotone_constraints | integer | 0 | - | \([-1, 1]\) |
| normalize_type | character | tree | tree, forest | - |
| nrounds | integer | - | - | \([1, \infty)\) |
| nthread | integer | 1 | - | \([1, \infty)\) |
| num_parallel_tree | integer | 1 | - | \([1, \infty)\) |
| objective | character | survival:cox | survival:cox, survival:aft | - |
| one_drop | logical | FALSE | TRUE, FALSE | - |
| print_every_n | integer | 1 | - | \([1, \infty)\) |
| process_type | character | default | default, update | - |
| rate_drop | numeric | 0 | - | \([0, 1]\) |
| refresh_leaf | logical | TRUE | TRUE, FALSE | - |
| sampling_method | character | uniform | uniform, gradient_based | - |
| sample_type | character | uniform | uniform, weighted | - |
| save_name | untyped | - | - | - |
| save_period | integer | - | - | \([0, \infty)\) |
| scale_pos_weight | numeric | 1 | - | \((-\infty, \infty)\) |
| seed_per_iteration | logical | FALSE | TRUE, FALSE | - |
| skip_drop | numeric | 0 | - | \([0, 1]\) |
| strict_shape | logical | FALSE | TRUE, FALSE | - |
| subsample | numeric | 1 | - | \([0, 1]\) |
| top_k | integer | 0 | - | \([0, \infty)\) |
| tree_method | character | auto | auto, exact, approx, hist, gpu_hist | - |
| tweedie_variance_power | numeric | 1.5 | - | \([1, 2]\) |
| updater | untyped | - | - | - |
| verbose | integer | 1 | - | \([0, 2]\) |
| watchlist | untyped | NULL | - | - |
| xgb_model | untyped | - | - | - |
| device | untyped | - | - | - |

Early stopping

Early stopping can be used to find the optimal number of boosting rounds. The early_stopping_set parameter controls which set is used to monitor the performance. By default, early_stopping_set = "none", which disables early stopping. Set early_stopping_set = "test" to monitor the performance of the model on the test set while training. The test set for early stopping can be set with the "test" row role in the mlr3::Task. Additionally, set early_stopping_rounds to the number of rounds within which the performance must improve, and nrounds to the maximum number of boosting rounds. During resampling, the test set is automatically supplied by the mlr3::Resampling. Note that using the test set for early stopping can potentially bias the performance scores.
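
A minimal sketch, assuming the rats task from mlr3proba and an mlr3 version that supports the "test" row role (the factor feature "sex" is dropped because this learner only supports integer and numeric features):

library(mlr3)
library(mlr3proba)
library(mlr3extralearners)

task = tsk("rats")
task$select(c("litter", "rx"))  # drop the factor feature "sex"

# hold out 20% of the rows and assign them the "test" row role,
# so they are used only to monitor performance during training
split = partition(task, ratio = 0.8)
task$set_row_roles(split$test, roles = "test")

learner = lrn("surv.xgboost",
  nrounds = 100,               # maximum number of boosting rounds
  early_stopping_rounds = 10,  # stop if no improvement within 10 rounds
  early_stopping_set = "test"
)
learner$train(task)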

References

Chen, Tianqi, Guestrin, Carlos (2016). “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785--794. ACM. doi:10.1145/2939672.2939785.

Author

be-marc

Super classes

mlr3::Learner -> mlr3proba::LearnerSurv -> LearnerSurvXgboost

Methods


Method new()

Creates a new instance of this R6 class.

Usage

LearnerSurvXgboost$new()

Method importance()

The importance scores are calculated with xgboost::xgb.importance().

Usage

LearnerSurvXgboost$importance()

Returns

Named numeric().
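
A minimal usage sketch (again assuming the rats task from mlr3proba, reduced to its integer/numeric features):

task = mlr3::tsk("rats")
task$select(c("litter", "rx"))  # this learner handles only integer and numeric features
learner = mlr3::lrn("surv.xgboost", nrounds = 20)
learner$train(task)
learner$importance()  # named numeric vector, sorted by importance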


Method clone()

The objects of this class are cloneable with this method.

Usage

LearnerSurvXgboost$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

learner = mlr3::lrn("surv.xgboost")
#> Warning: 'surv.xgboost' will be deprecated in the future. Use 'surv.xgboost.cox' or 'surv.xgboost.aft' learners instead.
print(learner)
#> <LearnerSurvXgboost:surv.xgboost>: Gradient Boosting
#> * Model: -
#> * Parameters: nrounds=1, nthread=1, verbose=0, early_stopping_set=none
#> * Packages: mlr3, mlr3proba, mlr3extralearners, xgboost
#> * Predict Types:  [crank], lp
#> * Feature Types: integer, numeric
#> * Properties: importance, missings, weights

# available parameters:
learner$param_set$ids()
#>  [1] "aft_loss_distribution"       "aft_loss_distribution_scale"
#>  [3] "alpha"                       "base_score"                 
#>  [5] "booster"                     "callbacks"                  
#>  [7] "colsample_bylevel"           "colsample_bynode"           
#>  [9] "colsample_bytree"            "disable_default_eval_metric"
#> [11] "early_stopping_rounds"       "early_stopping_set"         
#> [13] "eta"                         "feature_selector"           
#> [15] "feval"                       "gamma"                      
#> [17] "grow_policy"                 "interaction_constraints"    
#> [19] "iterationrange"              "lambda"                     
#> [21] "lambda_bias"                 "max_bin"                    
#> [23] "max_delta_step"              "max_depth"                  
#> [25] "max_leaves"                  "maximize"                   
#> [27] "min_child_weight"            "missing"                    
#> [29] "monotone_constraints"        "normalize_type"             
#> [31] "nrounds"                     "nthread"                    
#> [33] "num_parallel_tree"           "objective"                  
#> [35] "one_drop"                    "print_every_n"              
#> [37] "process_type"                "rate_drop"                  
#> [39] "refresh_leaf"                "sampling_method"            
#> [41] "sample_type"                 "save_name"                  
#> [43] "save_period"                 "scale_pos_weight"           
#> [45] "seed_per_iteration"          "skip_drop"                  
#> [47] "strict_shape"                "subsample"                  
#> [49] "top_k"                       "tree_method"                
#> [51] "tweedie_variance_power"      "updater"                    
#> [53] "verbose"                     "watchlist"                  
#> [55] "xgb_model"                   "device"