xgboost

Version:

1.5.0

Category:

lib

Cluster:

Loki

Author / Distributor:

https://xgboost.ai/

Description

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework and supports classification, regression, and ranking.

Version 1.5.0 offers:

  • GPU acceleration with CUDA 10.2

  • Native support for NumPy arrays and pandas DataFrames via the DMatrix interface

  • Improved JSON-based model format (see the sketch below)

  • Experimental support for categorical data

  • Integration with scikit-learn and cross-validation utilities

This module uses Python 3.7, CUDA 10.2, and GCC 8 for compatibility with legacy GPU-based workflows on Loki.
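
The JSON-based model format listed above can be exercised with a minimal, self-contained sketch; the synthetic data and the file name model.json are illustrative, not part of the module's own documentation:

import numpy as np
import xgboost as xgb

# Tiny synthetic regression problem
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, -0.5, 0.0, 2.0])
dtrain = xgb.DMatrix(X, label=y)

bst = xgb.train({"objective": "reg:squarederror"}, dtrain, num_boost_round=10)

# A ".json" extension selects the JSON model format on save
bst.save_model("model.json")
bst2 = xgb.Booster()
bst2.load_model("model.json")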

Documentation

usage: xgboost.train(params, dtrain, num_boost_round=10, evals=(), ...)

Common Parameters:
  booster: [gbtree, gblinear, dart]
  objective: [reg:squarederror, binary:logistic, multi:softmax, ...]
  eval_metric: [rmse, mae, logloss, error, merror, ...]
  tree_method: [auto, exact, approx, hist, gpu_hist]
  max_depth: Maximum depth of a tree
  eta: Step size shrinkage (learning rate)
  subsample: Row sampling rate per boosting round
  colsample_bytree: Column sampling rate per tree
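
As an illustration of how these parameters combine, the sketch below builds a parameter dictionary for xgboost.train; the values are arbitrary starting points, not tuned recommendations:

params = {
    "booster": "gbtree",
    "objective": "binary:logistic",
    "eval_metric": "logloss",
    "tree_method": "hist",       # switch to "gpu_hist" on a GPU node
    "max_depth": 6,              # deeper trees fit more, risk overfitting
    "eta": 0.1,                  # smaller steps usually need more rounds
    "subsample": 0.8,            # sample 80% of rows per boosting round
    "colsample_bytree": 0.8,     # sample 80% of columns per tree
}
# bst = xgboost.train(params, dtrain, num_boost_round=100)  # dtrain: a DMatrix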

CLI alternative:
  $ xgboost train.conf
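
The configuration file is a plain-text list of key = value pairs. The sketch below is modeled on the binary-classification demo shipped with XGBoost; the file paths are placeholders:

  # train.conf (illustrative)
  booster = gbtree
  objective = binary:logistic
  eta = 0.1
  max_depth = 6
  num_round = 50
  data = "train.txt"
  eval[test] = "test.txt"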

For Python help:
  >>> import xgboost as xgb
  >>> help(xgb)

Examples/Usage

  • Load the module:

$ module load xgboost-py37-cuda10.2-gcc8/1.5.0

  • Python usage:

import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load a small binary-classification dataset and hold out a test split
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target)

# Wrap the NumPy arrays in XGBoost's DMatrix container
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

params = {
    "objective": "binary:logistic",  # binary classification, probability output
    "tree_method": "gpu_hist",       # GPU-accelerated histogram algorithm
    "eval_metric": "logloss"
}

# Train for 50 boosting rounds, reporting log-loss on the test set each round
bst = xgb.train(params, dtrain, num_boost_round=50, evals=[(dtest, "eval")])
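
Continuing the run above, a short follow-up sketch produces predictions on the held-out data; with binary:logistic the outputs are probabilities, so they are thresholded at 0.5 for class labels:

import numpy as np

probs = bst.predict(dtest)             # probabilities in [0, 1]
preds = (probs > 0.5).astype(int)      # hard class labels
print("accuracy:", np.mean(preds == y_test))
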
  • Unload the module:

$ module unload xgboost-py37-cuda10.2-gcc8/1.5.0

Installation

Source code is obtained from the XGBoost project (https://github.com/dmlc/xgboost).
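
A minimal sketch of a typical CUDA-enabled build from source, assuming CMake and a CUDA toolkit are available (flags follow the upstream build instructions for this release series):

$ git clone --recursive https://github.com/dmlc/xgboost
$ cd xgboost
$ git checkout v1.5.0 && git submodule update --init --recursive
$ mkdir build && cd build
$ cmake .. -DUSE_CUDA=ON
$ make -j4
$ cd ../python-package && python setup.py install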