Boosted Generalized Linear Classification Learner
Fit a generalized linear classification model using a boosting algorithm.
Calls mboost::glmboost() from mboost.
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, mboost
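The two predict types listed above can be illustrated with a minimal sketch. It assumes that mlr3, mlr3extralearners and mboost are installed; the seed is arbitrary, and the family is set explicitly to "Binomial" (the documented default) only to make the choice visible.

```r
library(mlr3)
library(mlr3extralearners)

# Request class probabilities instead of the default hard labels ("response")
learner = lrn("classif.glmboost", family = "Binomial", predict_type = "prob")

task = tsk("sonar")
set.seed(1) # arbitrary seed so the train/test split is reproducible
ids = partition(task)

learner$train(task, row_ids = ids$train)
prediction = learner$predict(task, row_ids = ids$test)

# One probability column per class; each row sums to 1
head(prediction$prob)

# Probability-based measures such as the AUC can now be computed
prediction$score(msr("classif.auc"))
```

With predict_type = "response" the prediction holds class labels only, so measures that need probabilities are not available.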
Parameters
Id | Type | Default | Levels | Range |
family | character | Binomial | Binomial, AdaExp, AUC, custom | - |
custom.family | untyped | - | | - |
link | character | logit | logit, probit | - |
type | character | adaboost | glm, adaboost | - |
center | logical | TRUE | TRUE, FALSE | - |
mstop | integer | 100 | | \((-\infty, \infty)\) |
nu | numeric | 0.1 | | \((-\infty, \infty)\) |
risk | character | inbag | inbag, oobag, none | - |
oobweights | untyped | NULL | | - |
trace | logical | FALSE | TRUE, FALSE | - |
stopintern | untyped | FALSE | | - |
na.action | untyped | stats::na.omit | | - |
contrasts.arg | untyped | - | | - |
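As a minimal sketch of how these parameters are set, the snippet below passes values at construction and changes one afterwards. The specific values (250 and 500 iterations, step size 0.05) are arbitrary illustrations, not recommendations.

```r
library(mlr3)
library(mlr3extralearners)

# Set hyperparameters when constructing the learner
learner = lrn("classif.glmboost",
  family = "Binomial",
  mstop = 250L, # number of boosting iterations
  nu = 0.05     # step size
)

# Inspect the values that are currently set
learner$param_set$values

# Change a single value later on
learner$param_set$values$mstop = 500L
```

Parameters that are not set explicitly fall back to the defaults listed in the table.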
Offset
If a Task contains a column with the offset role, it is automatically incorporated via the offset argument in mboost's training function.
No offset is applied during prediction for this learner.
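A minimal sketch of how an offset column might be declared follows. The column name my_offset and its constant value are hypothetical and serve only as an illustration; the sketch also assumes an mlr3 version that supports the offset column role.

```r
library(mlr3)
library(mlr3extralearners)

task = tsk("sonar")

# Hypothetical offset: a constant shift on the scale of the linear predictor
task$cbind(data.frame(my_offset = rep(0.5, task$nrow)))

# Move the new column from the feature role to the offset role
task$set_col_roles("my_offset", roles = "offset")
task$col_roles$offset

# The learner picks up the offset during training
learner = lrn("classif.glmboost")
learner$train(task)
```

Because the column no longer has the feature role, it is not used as a predictor.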
References
Bühlmann, Peter, Yu, Bin (2003). “Boosting with the L2 loss: regression and classification.” Journal of the American Statistical Association, 98(462), 324–339.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifGLMBoost
Examples
# Define the Learner
learner = mlr3::lrn("classif.glmboost")
print(learner)
#> <LearnerClassifGLMBoost:classif.glmboost>: Boosted Generalized Linear Model
#> * Model: -
#> * Parameters: list()
#> * Packages: mlr3, mlr3extralearners, mboost
#> * Predict Types: [response], prob
#> * Feature Types: integer, numeric, factor, ordered
#> * Properties: offset, twoclass, weights
# Define a Task
task = mlr3::tsk("sonar")
# Create train and test set
ids = mlr3::partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
#>
#> Generalized Linear Models Fitted via Gradient Boosting
#>
#> Call:
#> glmboost.formula(formula = f, data = data, family = new("boost_family_glm", fW = function (f) { f <- pmin(abs(f), 36) * sign(f) p <- exp(f)/(exp(f) + exp(-f)) 4 * p * (1 - p) }, ngradient = function (y, f, w = 1) { exp2yf <- exp(-2 * y * f) -(-2 * y * exp2yf)/(log(2) * (1 + exp2yf)) }, risk = function (y, f, w = 1) sum(w * loss(y, f), na.rm = TRUE), offset = function (y, w) { p <- weighted.mean(y > 0, w) 1/2 * log(p/(1 - p)) }, check_y = function (y) { if (!is.factor(y)) stop("response is not a factor but ", sQuote("family = Binomial()")) if (nlevels(y) != 2) stop("response is not a factor at two levels but ", sQuote("family = Binomial()")) return(c(-1, 1)[as.integer(y)]) }, weights = function (w) { switch(weights, any = TRUE, none = isTRUE(all.equal(unique(w), 1)), zeroone = isTRUE(all.equal(unique(w + abs(w - 1)), 1)), case = isTRUE(all.equal(unique(w - floor(w)), 0))) }, nuisance = function () return(NA), response = function (f) { f <- pmin(abs(f), 36) * sign(f) p <- exp(f)/(exp(f) + exp(-f)) return(p) }, rclass = function (f) (f > 0) + 1, name = "Negative Binomial Likelihood (logit link)", charloss = c("{ \n", " f <- pmin(abs(f), 36) * sign(f) \n", " p <- exp(f)/(exp(f) + exp(-f)) \n", " y <- (y + 1)/2 \n", " -y * log(p) - (1 - y) * log(1 - p) \n", "} \n")), control = ctrl)
#>
#>
#> Negative Binomial Likelihood (logit link)
#>
#> Loss function: {
#> f <- pmin(abs(f), 36) * sign(f)
#> p <- exp(f)/(exp(f) + exp(-f))
#> y <- (y + 1)/2
#> -y * log(p) - (1 - y) * log(1 - p)
#> }
#>
#>
#> Number of boosting iterations: mstop = 100
#> Step size: 0.1
#> Offset: -0.02158609
#>
#> Coefficients:
#>
#> NOTE: Coefficients from a Binomial model are half the size of coefficients
#> from a model fitted via glm(... , family = 'binomial').
#> See Warning section in ?coef.mboost
#> (Intercept) V1 V11 V16 V21 V22
#> 1.28675739 -3.05578860 -1.04950909 0.34362416 -0.17475105 -0.44095854
#> V27 V28 V29 V31 V36 V39
#> -0.18791429 -0.10631261 -0.09382121 0.48052856 1.01850540 -0.17606440
#> V4 V43 V44 V45 V48 V49
#> -1.90446822 -0.35156219 -0.27965940 -1.54828687 -0.91865122 -2.14748741
#> V51 V52 V55 V59 V60 V7
#> -6.10238471 -5.84176563 4.36443620 -1.62276103 -4.48769576 0.15926427
#> V9
#> -0.39427663
#> attr(,"offset")
#> [1] -0.02158609
#>
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.1884058
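The NOTE in the model printout says that coefficients from a Binomial model are half the size of those from glm(..., family = 'binomial'). The sketch below extracts the coefficients from the fitted mboost object and rescales them. It trains on the full task for brevity, so the values differ from the output above, and the sign of the rescaled coefficients still depends on how the response levels are coded.

```r
library(mlr3)
library(mlr3extralearners)

learner = lrn("classif.glmboost")
learner$train(tsk("sonar"))

# Coefficients of the selected features; off2int = TRUE adds the
# offset to the intercept
cf = coef(learner$model, off2int = TRUE)

# Multiply by 2 to move to the scale used by glm(..., family = "binomial")
2 * cf
```

Only features selected in at least one boosting iteration appear in the result; all other coefficients are implicitly zero.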