Generalized linear classification model.
Calls h2o::h2o.glm() from package h2o with family always set to "binomial".
H2O Connection
If no running H2O connection is found, the learner will automatically start a local H2O server
on 127.0.0.1 via h2o::h2o.init().
If you want to connect to a remote H2O cluster, call h2o::h2o.init() with the appropriate
arguments before training or predicting.
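A minimal sketch of connecting to a remote cluster before using the learner (the host and port below are placeholders; substitute your own cluster's address):

```r
library(h2o)

# Connect to an existing remote H2O cluster instead of starting a local one.
# "cluster.example.com" and 54321 are illustrative values, not real endpoints.
h2o.init(ip = "cluster.example.com", port = 54321)
```

Once a connection exists, subsequent `$train()` and `$predict()` calls reuse it rather than spawning a local server.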
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3extralearners, h2o
Parameters
| Id | Type | Default | Levels | Range |
| --- | --- | --- | --- | --- |
| alpha | numeric | 0.5 | - | \([0, 1]\) |
| balance_classes | logical | FALSE | TRUE, FALSE | - |
| beta_constraints | untyped | NULL | - | - |
| beta_epsilon | numeric | 1e-04 | - | \([0, \infty)\) |
| build_null_model | logical | FALSE | TRUE, FALSE | - |
| calc_like | logical | FALSE | TRUE, FALSE | - |
| checkpoint | untyped | NULL | - | - |
| class_sampling_factors | untyped | NULL | - | - |
| cold_start | logical | FALSE | TRUE, FALSE | - |
| compute_p_values | logical | FALSE | TRUE, FALSE | - |
| early_stopping | logical | TRUE | TRUE, FALSE | - |
| export_checkpoints_dir | untyped | NULL | - | - |
| gainslift_bins | integer | -1 | - | \([-1, \infty)\) |
| generate_scoring_history | logical | FALSE | TRUE, FALSE | - |
| generate_variable_inflation_factors | logical | FALSE | TRUE, FALSE | - |
| gradient_epsilon | numeric | -1 | - | \([0, \infty)\) |
| HGLM | logical | FALSE | TRUE, FALSE | - |
| ignore_const_cols | logical | TRUE | TRUE, FALSE | - |
| interactions | untyped | NULL | - | - |
| interaction_pairs | untyped | NULL | - | - |
| intercept | logical | TRUE | TRUE, FALSE | - |
| lambda | numeric | 1e-05 | - | \([0, \infty)\) |
| lambda_min_ratio | numeric | -1 | - | \([0, 1]\) |
| lambda_search | logical | FALSE | TRUE, FALSE | - |
| link | character | logit | family_default, logit | - |
| max_active_predictors | integer | -1 | - | \([1, \infty)\) |
| max_after_balance_size | numeric | 5 | - | \([0, \infty)\) |
| max_iterations | integer | -1 | - | \([0, \infty)\) |
| max_runtime_secs | numeric | 0 | - | \([0, \infty)\) |
| missing_values_handling | character | MeanImputation | MeanImputation, Skip, PlugValues | - |
| nlambdas | integer | -1 | - | \([1, \infty)\) |
| non_negative | logical | FALSE | TRUE, FALSE | - |
| objective_epsilon | numeric | -1 | - | \([0, \infty)\) |
| obj_reg | numeric | -1 | - | \([0, \infty)\) |
| plug_values | untyped | NULL | - | - |
| prior | numeric | -1 | - | \([0, \infty)\) |
| random_columns | untyped | NULL | - | - |
| remove_collinear_columns | logical | FALSE | TRUE, FALSE | - |
| score_each_iteration | logical | FALSE | TRUE, FALSE | - |
| score_iteration_interval | integer | -1 | - | \((-\infty, \infty)\) |
| seed | integer | -1 | - | \((-\infty, \infty)\) |
| solver | character | AUTO | AUTO, IRLSM, L_BFGS, COORDINATE_DESCENT, COORDINATE_DESCENT_NAIVE | - |
| standardize | logical | TRUE | TRUE, FALSE | - |
| startval | untyped | NULL | - | - |
| stopping_metric | character | AUTO | AUTO, logloss, AUC, AUCPR, lift_top_group, misclassification, mean_per_class_error | - |
| stopping_rounds | integer | 0 | - | \([0, \infty)\) |
| stopping_tolerance | numeric | 0.001 | - | \([0, \infty)\) |
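Hyperparameters from the table above can be set directly in `lrn()` or later via the learner's `param_set`. A short sketch with illustrative values (not tuned recommendations): `alpha = 1` selects pure lasso regularization, and `lambda_search = TRUE` lets H2O search over a regularization path.

```r
library(mlr3)
library(mlr3extralearners)

# Configure the learner with lasso regularization and a lambda search.
# The specific values here are for illustration only.
learner = lrn("classif.h2o.glm",
  alpha = 1,
  lambda_search = TRUE,
  seed = 42L
)

# Inspect the configured hyperparameters
learner$param_set$values
```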
References
Fryda T, LeDell E, Gill N, Aiello S, Fu A, Candel A, Click C, Kraljevic T, Nykodym T, Aboyoun P, Kurka M, Malohlava M, Poirier S, Wong W (2025). h2o: R Interface for the 'H2O' Scalable Machine Learning Platform. R package version 3.46.0.9, https://github.com/h2oai/h2o-3.
See also
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifH2OGLM
Methods
Inherited methods
mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
Examples
# Define the Learner
learner = lrn("classif.h2o.glm")
print(learner)
#>
#> ── <LearnerClassifH2OGLM> (classif.h2o.glm): H2O GLM ───────────────────────────
#> • Model: -
#> • Parameters: list()
#> • Packages: mlr3, mlr3extralearners, and h2o
#> • Predict Types: [response] and prob
#> • Feature Types: integer, numeric, and factor
#> • Encapsulation: none (fallback: -)
#> • Properties: missings, twoclass, and weights
#> • Other settings: use_weights = 'use'
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
#> Model Details:
#> ==============
#>
#> H2OBinomialModel: glm
#> Model ID: GLM_model_R_1774260318250_56
#> GLM Model: summary
#> family link regularization
#> 1 binomial logit Elastic Net (alpha = 0.5, lambda = 0.04617 )
#> number_of_predictors_total number_of_active_predictors number_of_iterations
#> 1 60 30 7
#> training_frame
#> 1 data_sid_bb94_11
#>
#> Coefficients: glm coefficients
#> names coefficients standardized_coefficients
#> 1 Intercept 4.584945 -0.270430
#> 2 V1 -7.378692 -0.153974
#> 3 V10 -0.162476 -0.022048
#> 4 V11 -3.264438 -0.441835
#> 5 V12 -3.433204 -0.482228
#>
#> ---
#> names coefficients standardized_coefficients
#> 56 V59 0.000000 0.000000
#> 57 V6 0.000000 0.000000
#> 58 V60 0.000000 0.000000
#> 59 V7 0.000000 0.000000
#> 60 V8 0.872680 0.071011
#> 61 V9 0.000000 0.000000
#>
#> H2OBinomialMetrics: glm
#> ** Reported on training data. **
#>
#> MSE: 0.1148664
#> RMSE: 0.3389195
#> LogLoss: 0.3746031
#> Mean Per-Class Error: 0.1505198
#> AUC: 0.9345114
#> AUCPR: 0.9353997
#> Gini: 0.8690229
#> R^2: 0.5385999
#> Residual Deviance: 104.1397
#> AIC: 166.1397
#>
#> Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
#> M R Error Rate
#> M 54 20 0.270270 =20/74
#> R 2 63 0.030769 =2/65
#> Totals 56 83 0.158273 =22/139
#>
#> Maximum Metrics: Maximum metrics at their respective thresholds
#> metric threshold value idx
#> 1 max f1 0.369332 0.851351 82
#> 2 max f2 0.369332 0.918367 82
#> 3 max f0point5 0.633329 0.903614 45
#> 4 max accuracy 0.633329 0.848921 45
#> 5 max precision 0.945882 1.000000 0
#> 6 max recall 0.139631 1.000000 115
#> 7 max specificity 0.945882 1.000000 0
#> 8 max absolute_mcc 0.633329 0.719764 45
#> 9 max min_per_class_accuracy 0.466428 0.824324 66
#> 10 max mean_per_class_accuracy 0.369332 0.849480 82
#> 11 max tns 0.945882 74.000000 0
#> 12 max fns 0.945882 64.000000 0
#> 13 max fps 0.003749 74.000000 138
#> 14 max tps 0.139631 65.000000 115
#> 15 max tnr 0.945882 1.000000 0
#> 16 max fnr 0.945882 0.984615 0
#> 17 max fpr 0.003749 1.000000 138
#> 18 max tpr 0.139631 1.000000 115
#>
#> Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
#>
#>
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.2753623
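The example above uses the default predict type "response", so only misclassification-style measures apply. To score with a probability-based measure such as AUC, switch the learner to probability prediction before training. A sketch continuing the example (output omitted, as it depends on the random partition):

```r
# Request class probabilities instead of hard labels
learner = lrn("classif.h2o.glm", predict_type = "prob")
learner$train(task, row_ids = ids$train)
predictions = learner$predict(task, row_ids = ids$test)

# Probability-based measures are now available
predictions$score(msr("classif.auc"))
```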