Boosted Generalized Linear Classification Learner
Fit a generalized linear classification model using a boosting algorithm.
Calls mboost::glmboost() from mboost.
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3extralearners, mboost
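The "prob" predict type listed above can be requested when the learner is constructed. The following is a minimal sketch, assuming mlr3, mlr3extralearners and mboost are installed; it trains and predicts on the full "sonar" task purely for illustration, so the probabilities are in-sample.

```r
library(mlr3)
library(mlr3extralearners)

# Request class probabilities instead of the default "response"
learner = lrn("classif.glmboost", predict_type = "prob")

# Illustration only: train and predict on the same rows
task = tsk("sonar")
learner$train(task)
pred = learner$predict(task)

# One column of probabilities per class, one row per observation
head(pred$prob)
```

With `predict_type = "prob"`, the prediction object still carries a `response` column, derived from the predicted probabilities.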
Parameters
Id | Type | Default | Levels | Range
offset | numeric | NULL | | \((-\infty, \infty)\)
family | character | Binomial | Binomial, AdaExp, AUC, custom | -
custom.family | untyped | - | | -
link | character | logit | logit, probit | -
type | character | adaboost | glm, adaboost | -
center | logical | TRUE | TRUE, FALSE | -
mstop | integer | 100 | | \((-\infty, \infty)\)
nu | numeric | 0.1 | | \((-\infty, \infty)\)
risk | character | inbag | inbag, oobag, none | -
oobweights | untyped | NULL | | -
trace | logical | FALSE | TRUE, FALSE | -
stopintern | untyped | FALSE | | -
na.action | untyped | stats::na.omit | | -
contrasts.arg | untyped | - | | -
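The parameters in the table can be set either when constructing the learner or later through its parameter set. The sketch below assumes mlr3, mlr3extralearners and mboost are installed; the specific values (200 iterations, step size 0.05) are arbitrary choices for illustration, not recommendations.

```r
library(mlr3)
library(mlr3extralearners)

# Set hyperparameters at construction time
learner = lrn("classif.glmboost", mstop = 200, nu = 0.05)

# Or change them afterwards via the parameter set
learner$param_set$values$center = TRUE

# Inspect the values that are currently set
print(learner$param_set$values)
```

A smaller `nu` generally calls for a larger `mstop`, so the two are usually adjusted together.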
References
Bühlmann, Peter, Yu, Bin (2003). “Boosting with the L2 loss: regression and classification.” Journal of the American Statistical Association, 98(462), 324–339.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifGLMBoost
Examples
# Define the Learner
learner = mlr3::lrn("classif.glmboost")
print(learner)
#> <LearnerClassifGLMBoost:classif.glmboost>: Boosted Generalized Linear Model
#> * Model: -
#> * Parameters: list()
#> * Packages: mlr3, mlr3extralearners, mboost
#> * Predict Types: [response], prob
#> * Feature Types: integer, numeric, factor, ordered
#> * Properties: twoclass, weights
# Define a Task
task = mlr3::tsk("sonar")
# Create train and test set
ids = mlr3::partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
#>
#> Generalized Linear Models Fitted via Gradient Boosting
#>
#> Call:
#> glmboost.formula(formula = f, data = data, family = new("boost_family_glm", fW = function (f) { f <- pmin(abs(f), 36) * sign(f) p <- exp(f)/(exp(f) + exp(-f)) 4 * p * (1 - p) }, ngradient = function (y, f, w = 1) { exp2yf <- exp(-2 * y * f) -(-2 * y * exp2yf)/(log(2) * (1 + exp2yf)) }, risk = function (y, f, w = 1) sum(w * loss(y, f), na.rm = TRUE), offset = function (y, w) { p <- weighted.mean(y > 0, w) 1/2 * log(p/(1 - p)) }, check_y = function (y) { if (!is.factor(y)) stop("response is not a factor but ", sQuote("family = Binomial()")) if (nlevels(y) != 2) stop("response is not a factor at two levels but ", sQuote("family = Binomial()")) return(c(-1, 1)[as.integer(y)]) }, weights = function (w) { switch(weights, any = TRUE, none = isTRUE(all.equal(unique(w), 1)), zeroone = isTRUE(all.equal(unique(w + abs(w - 1)), 1)), case = isTRUE(all.equal(unique(w - floor(w)), 0))) }, nuisance = function () return(NA), response = function (f) { f <- pmin(abs(f), 36) * sign(f) p <- exp(f)/(exp(f) + exp(-f)) return(p) }, rclass = function (f) (f > 0) + 1, name = "Negative Binomial Likelihood (logit link)", charloss = c("{ \n", " f <- pmin(abs(f), 36) * sign(f) \n", " p <- exp(f)/(exp(f) + exp(-f)) \n", " y <- (y + 1)/2 \n", " -y * log(p) - (1 - y) * log(1 - p) \n", "} \n")), control = ctrl)
#>
#>
#> Negative Binomial Likelihood (logit link)
#>
#> Loss function: {
#> f <- pmin(abs(f), 36) * sign(f)
#> p <- exp(f)/(exp(f) + exp(-f))
#> y <- (y + 1)/2
#> -y * log(p) - (1 - y) * log(1 - p)
#> }
#>
#>
#> Number of boosting iterations: mstop = 100
#> Step size: 0.1
#> Offset: -0.1375516
#>
#> Coefficients:
#>
#> NOTE: Coefficients from a Binomial model are half the size of coefficients
#> from a model fitted via glm(... , family = 'binomial').
#> See Warning section in ?coef.mboost
#> (Intercept) V1 V11 V12 V15 V16
#> 1.44850329 -1.61973244 -1.98614906 -0.51676485 0.27994821 0.24371062
#> V20 V21 V22 V23 V31 V36
#> -0.12385778 -0.05242839 -0.57733875 -0.16141872 0.21417983 0.97756917
#> V4 V40 V43 V45 V47 V48
#> -1.60584671 0.29314894 -0.84063438 -1.29701959 -0.39839278 -2.94077397
#> V49 V52 V55 V56
#> -0.82349669 -6.25062747 3.12775491 1.87245764
#> attr(,"offset")
#> [1] -0.1375516
#>
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.2028986
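The NOTE in the printed model says that coefficients from a Binomial model are half the size of those from glm(..., family = 'binomial'). This follows from the response function shown in the output, p = exp(f)/(exp(f) + exp(-f)), which equals 1/(1 + exp(-2f)), so the fitted predictor f is half the log-odds. The base R sketch below checks this identity numerically; the values of `f` are arbitrary and are not taken from the fitted model.

```r
# Arbitrary values of the additive predictor f (illustration only)
f = c(-2, -0.5, 0, 0.5, 2)

# Response function as printed in the model output
p_mboost = exp(f) / (exp(f) + exp(-f))

# Standard logistic function evaluated at 2 * f
p_logistic = plogis(2 * f)

# Both give the same probabilities
stopifnot(isTRUE(all.equal(p_mboost, p_logistic)))

# Equivalently, the log-odds are twice the predictor
log_odds = log(p_mboost / (1 - p_mboost))
stopifnot(isTRUE(all.equal(log_odds, 2 * f)))
```

Under this parameterization, doubling the reported coefficients puts them on the usual log-odds scale; see the Warning section in ?coef.mboost for the authoritative statement.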