Gradient boosting algorithm. Calls lightgbm::lightgbm() from package lightgbm. The list of parameters can be found here and in the documentation of lightgbm::lgb.train(). Note that lightgbm models have to be saved with lightgbm::lgb.save(), so the learner cannot simply be saved with saveRDS(). This will change in future versions of lightgbm.
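For example, a trained learner's booster can be persisted and reloaded like this (a minimal sketch; the task and file name are illustrative):

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("regr.lightgbm")
learner$train(tsk("mtcars"))

# Persist the underlying lgb.Booster with lightgbm's own serializer;
# an object saved via saveRDS() would not survive an R session restart.
lightgbm::lgb.save(learner$model, "lightgbm_model.txt")

# Later: reload the booster and predict with it directly.
booster = lightgbm::lgb.load("lightgbm_model.txt")
```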

Dictionary

This Learner can be instantiated via the dictionary mlr_learners or with the associated sugar function lrn():

mlr_learners$get("regr.lightgbm")
lrn("regr.lightgbm")

Meta Information

  • Task type: “regr”

  • Predict Types: “response”

  • Feature Types: “logical”, “integer”, “numeric”, “factor”

  • Required Packages: mlr3, mlr3extralearners, lightgbm

Parameters

Id | Type | Default | Levels | Range
---|---|---|---|---
num_iterations | integer | 100 | - | \([0, \infty)\)
objective | character | regression | regression, regression_l1, huber, fair, poisson, quantile, mape, gamma, tweedie | -
eval | untyped | - | - | -
verbose | integer | 1 | - | \((-\infty, \infty)\)
record | logical | TRUE | TRUE, FALSE | -
eval_freq | integer | 1 | - | \([1, \infty)\)
early_stopping_rounds | integer | - | - | \([1, \infty)\)
early_stopping | logical | FALSE | TRUE, FALSE | -
callbacks | untyped | - | - | -
reset_data | logical | FALSE | TRUE, FALSE | -
categorical_feature | untyped | - | - | -
convert_categorical | logical | TRUE | TRUE, FALSE | -
boosting | character | gbdt | gbdt, rf, dart, goss | -
linear_tree | logical | FALSE | TRUE, FALSE | -
learning_rate | numeric | 0.1 | - | \([0, \infty)\)
num_leaves | integer | 31 | - | \([1, 131072]\)
tree_learner | character | serial | serial, feature, data, voting | -
num_threads | integer | 0 | - | \([0, \infty)\)
device_type | character | cpu | cpu, gpu | -
seed | integer | - | - | \((-\infty, \infty)\)
deterministic | logical | FALSE | TRUE, FALSE | -
data_sample_strategy | character | bagging | bagging, goss | -
force_col_wise | logical | FALSE | TRUE, FALSE | -
force_row_wise | logical | FALSE | TRUE, FALSE | -
histogram_pool_size | integer | -1 | - | \((-\infty, \infty)\)
max_depth | integer | -1 | - | \((-\infty, \infty)\)
min_data_in_leaf | integer | 20 | - | \([0, \infty)\)
min_sum_hessian_in_leaf | numeric | 0.001 | - | \([0, \infty)\)
bagging_fraction | numeric | 1 | - | \([0, 1]\)
bagging_freq | integer | 0 | - | \([0, \infty)\)
bagging_seed | integer | 3 | - | \((-\infty, \infty)\)
feature_fraction | numeric | 1 | - | \([0, 1]\)
feature_fraction_bynode | numeric | 1 | - | \([0, 1]\)
feature_fraction_seed | integer | 2 | - | \((-\infty, \infty)\)
extra_trees | logical | FALSE | TRUE, FALSE | -
extra_seed | integer | 6 | - | \((-\infty, \infty)\)
first_metric_only | logical | FALSE | TRUE, FALSE | -
max_delta_step | numeric | 0 | - | \((-\infty, \infty)\)
lambda_l1 | numeric | 0 | - | \([0, \infty)\)
lambda_l2 | numeric | 0 | - | \([0, \infty)\)
linear_lambda | numeric | 0 | - | \([0, \infty)\)
min_gain_to_split | numeric | 0 | - | \([0, \infty)\)
drop_rate | numeric | 0.1 | - | \([0, 1]\)
max_drop | integer | 50 | - | \((-\infty, \infty)\)
skip_drop | numeric | 0.5 | - | \([0, 1]\)
xgboost_dart_mode | logical | FALSE | TRUE, FALSE | -
uniform_drop | logical | FALSE | TRUE, FALSE | -
drop_seed | integer | 4 | - | \((-\infty, \infty)\)
top_rate | numeric | 0.2 | - | \([0, 1]\)
other_rate | numeric | 0.1 | - | \([0, 1]\)
min_data_per_group | integer | 100 | - | \([1, \infty)\)
max_cat_threshold | integer | 32 | - | \([1, \infty)\)
cat_l2 | numeric | 10 | - | \([0, \infty)\)
cat_smooth | numeric | 10 | - | \([0, \infty)\)
max_cat_to_onehot | integer | 4 | - | \([1, \infty)\)
top_k | integer | 20 | - | \([1, \infty)\)
monotone_constraints | untyped | - | - | -
monotone_constraints_method | character | basic | basic, intermediate, advanced | -
monotone_penalty | numeric | 0 | - | \([0, \infty)\)
feature_contri | untyped | - | - | -
forcedsplits_filename | untyped | - | - | -
refit_decay_rate | numeric | 0.9 | - | \([0, 1]\)
cegb_tradeoff | numeric | 1 | - | \([0, \infty)\)
cegb_penalty_split | numeric | 0 | - | \([0, \infty)\)
cegb_penalty_feature_lazy | untyped | - | - | -
cegb_penalty_feature_coupled | untyped | - | - | -
path_smooth | numeric | 0 | - | \([0, \infty)\)
interaction_constraints | untyped | - | - | -
use_quantized_grad | logical | TRUE | TRUE, FALSE | -
num_grad_quant_bins | integer | 4 | - | \((-\infty, \infty)\)
quant_train_renew_leaf | logical | FALSE | TRUE, FALSE | -
stochastic_rounding | logical | TRUE | TRUE, FALSE | -
max_bin | integer | 255 | - | \([2, \infty)\)
max_bin_by_feature | untyped | - | - | -
min_data_in_bin | integer | 3 | - | \([1, \infty)\)
bin_construct_sample_cnt | integer | 200000 | - | \([1, \infty)\)
data_random_seed | integer | 1 | - | \((-\infty, \infty)\)
is_enable_sparse | logical | TRUE | TRUE, FALSE | -
enable_bundle | logical | TRUE | TRUE, FALSE | -
use_missing | logical | TRUE | TRUE, FALSE | -
zero_as_missing | logical | FALSE | TRUE, FALSE | -
feature_pre_filter | logical | TRUE | TRUE, FALSE | -
pre_partition | logical | FALSE | TRUE, FALSE | -
two_round | logical | FALSE | TRUE, FALSE | -
forcedbins_filename | untyped | - | - | -
boost_from_average | logical | TRUE | TRUE, FALSE | -
reg_sqrt | logical | FALSE | TRUE, FALSE | -
alpha | numeric | 0.9 | - | \([0, \infty)\)
fair_c | numeric | 1 | - | \([0, \infty)\)
poisson_max_delta_step | numeric | 0.7 | - | \([0, \infty)\)
tweedie_variance_power | numeric | 1.5 | - | \([1, 2]\)
metric_freq | integer | 1 | - | \([1, \infty)\)
num_machines | integer | 1 | - | \([1, \infty)\)
local_listen_port | integer | 12400 | - | \([1, \infty)\)
time_out | integer | 120 | - | \([1, \infty)\)
machines | untyped | - | - | -
gpu_platform_id | integer | -1 | - | \((-\infty, \infty)\)
gpu_device_id | integer | -1 | - | \((-\infty, \infty)\)
gpu_use_dp | logical | FALSE | TRUE, FALSE | -
num_gpu | integer | 1 | - | \([1, \infty)\)
start_iteration_predict | integer | 0 | - | \((-\infty, \infty)\)
num_iteration_predict | integer | -1 | - | \((-\infty, \infty)\)
pred_early_stop | logical | FALSE | TRUE, FALSE | -
pred_early_stop_freq | integer | 10 | - | \((-\infty, \infty)\)
pred_early_stop_margin | numeric | 10 | - | \((-\infty, \infty)\)

Initial parameter values

  • num_threads:

    • Actual default: 0L

    • Initial value: 1L

    • Reason for change: Prevents accidental conflicts with parallelization via the future package.

  • verbose:

    • Actual default: 1L

    • Initial value: -1L

    • Reason for change: Prevents accidental conflicts with mlr messaging system.
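These initial values are only set at construction time and can be overridden like any other hyperparameter, e.g. to restore lightgbm's own defaults (a minimal sketch):

```r
learner = lrn("regr.lightgbm")

# Revert to lightgbm's defaults shown in the table above: use all
# available threads and verbose logging.
learner$param_set$values$num_threads = 0L
learner$param_set$values$verbose = 1L
```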

Custom mlr3 parameters

  • early_stopping: Whether to use the test set for early stopping. Default is FALSE.

  • convert_categorical: Additional parameter. If this parameter is set to TRUE (default), all factor and logical columns are converted to integers and the parameter categorical_feature of lightgbm is set to those columns.
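As a sketch of how these parameters are typically combined (the exact handling of the validation data depends on your mlr3 version, so treat this as an assumption to verify against the mlr3 documentation):

```r
# Give the learner a large iteration budget, stopping after 10 rounds
# without improvement on the held-out data; factors are converted
# automatically via convert_categorical.
learner = lrn("regr.lightgbm",
  num_iterations = 1000,
  early_stopping = TRUE,
  early_stopping_rounds = 10,
  convert_categorical = TRUE  # default; shown here for clarity
)
```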

References

Ke, Guolin, Meng, Qi, Finley, Thomas, Wang, Taifeng, Chen, Wei, Ma, Weidong, Ye, Qiwei, Liu, Tie-Yan (2017). “Lightgbm: A highly efficient gradient boosting decision tree.” Advances in neural information processing systems, 30.

Author

kapsner

Super classes

mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrLightGBM

Methods

Inherited methods


Method new()

Creates a new instance of this R6 class.

Usage

LearnerRegrLightGBM$new()

Method importance()

The importance scores are extracted from lightgbm::lgb.importance().

Usage

LearnerRegrLightGBM$importance()

Returns

Named numeric().
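A minimal usage sketch (task choice illustrative); the learner must be trained before importance scores are available:

```r
task = tsk("mtcars")
learner = lrn("regr.lightgbm")
learner$train(task)

# Named numeric vector of per-feature importance scores.
learner$importance()
```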


Method clone()

The objects of this class are cloneable with this method.

Usage

LearnerRegrLightGBM$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

learner = mlr3::lrn("regr.lightgbm")
print(learner)
#> <LearnerRegrLightGBM:regr.lightgbm>: Gradient Boosting
#> * Model: -
#> * Parameters: num_threads=1, verbose=-1, objective=regression,
#>   convert_categorical=TRUE
#> * Packages: mlr3, mlr3extralearners, lightgbm
#> * Predict Types:  [response]
#> * Feature Types: logical, integer, numeric, factor
#> * Properties: hotstart_forward, importance, missings, weights

# available parameters:
learner$param_set$ids()
#>   [1] "num_iterations"               "objective"                   
#>   [3] "eval"                         "verbose"                     
#>   [5] "record"                       "eval_freq"                   
#>   [7] "early_stopping_rounds"        "early_stopping"              
#>   [9] "callbacks"                    "reset_data"                  
#>  [11] "categorical_feature"          "convert_categorical"         
#>  [13] "boosting"                     "linear_tree"                 
#>  [15] "learning_rate"                "num_leaves"                  
#>  [17] "tree_learner"                 "num_threads"                 
#>  [19] "device_type"                  "seed"                        
#>  [21] "deterministic"                "data_sample_strategy"        
#>  [23] "force_col_wise"               "force_row_wise"              
#>  [25] "histogram_pool_size"          "max_depth"                   
#>  [27] "min_data_in_leaf"             "min_sum_hessian_in_leaf"     
#>  [29] "bagging_fraction"             "bagging_freq"                
#>  [31] "bagging_seed"                 "feature_fraction"            
#>  [33] "feature_fraction_bynode"      "feature_fraction_seed"       
#>  [35] "extra_trees"                  "extra_seed"                  
#>  [37] "first_metric_only"            "max_delta_step"              
#>  [39] "lambda_l1"                    "lambda_l2"                   
#>  [41] "linear_lambda"                "min_gain_to_split"           
#>  [43] "drop_rate"                    "max_drop"                    
#>  [45] "skip_drop"                    "xgboost_dart_mode"           
#>  [47] "uniform_drop"                 "drop_seed"                   
#>  [49] "top_rate"                     "other_rate"                  
#>  [51] "min_data_per_group"           "max_cat_threshold"           
#>  [53] "cat_l2"                       "cat_smooth"                  
#>  [55] "max_cat_to_onehot"            "top_k"                       
#>  [57] "monotone_constraints"         "monotone_constraints_method" 
#>  [59] "monotone_penalty"             "feature_contri"              
#>  [61] "forcedsplits_filename"        "refit_decay_rate"            
#>  [63] "cegb_tradeoff"                "cegb_penalty_split"          
#>  [65] "cegb_penalty_feature_lazy"    "cegb_penalty_feature_coupled"
#>  [67] "path_smooth"                  "interaction_constraints"     
#>  [69] "use_quantized_grad"           "num_grad_quant_bins"         
#>  [71] "quant_train_renew_leaf"       "stochastic_rounding"         
#>  [73] "max_bin"                      "max_bin_by_feature"          
#>  [75] "min_data_in_bin"              "bin_construct_sample_cnt"    
#>  [77] "data_random_seed"             "is_enable_sparse"            
#>  [79] "enable_bundle"                "use_missing"                 
#>  [81] "zero_as_missing"              "feature_pre_filter"          
#>  [83] "pre_partition"                "two_round"                   
#>  [85] "forcedbins_filename"          "boost_from_average"          
#>  [87] "reg_sqrt"                     "alpha"                       
#>  [89] "fair_c"                       "poisson_max_delta_step"      
#>  [91] "tweedie_variance_power"       "metric_freq"                 
#>  [93] "num_machines"                 "local_listen_port"           
#>  [95] "time_out"                     "machines"                    
#>  [97] "gpu_platform_id"              "gpu_device_id"               
#>  [99] "gpu_use_dp"                   "num_gpu"                     
#> [101] "start_iteration_predict"      "num_iteration_predict"       
#> [103] "pred_early_stop"              "pred_early_stop_freq"        
#> [105] "pred_early_stop_margin"