xgboost
- Version: 1.5.0
- Category: lib
- Cluster: Loki
Description
XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework and supports classification, regression, and ranking.
Version 1.5.0 offers:
- GPU acceleration with CUDA 10.2
- Native support for NumPy, Pandas, and DMatrix inputs
- Improved JSON-based model format
- Experimental support for categorical data
- Integration with scikit-learn and cross-validation utilities (see the sketch below)
This module uses Python 3.7, CUDA 10.2, and GCC 8 for compatibility with legacy GPU-based workflows on Loki.
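As a minimal sketch of the scikit-learn integration noted above (the dataset, hyperparameter values, and printed label are illustrative, not recommendations):

import xgboost as xgb
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation using the sklearn-compatible wrapper
X, y = load_iris(return_X_y=True)
model = xgb.XGBClassifier(n_estimators=50, max_depth=4, use_label_encoder=False)
scores = cross_val_score(model, X, y, cv=5)
print("mean accuracy:", scores.mean())

Because XGBClassifier follows the scikit-learn estimator API, it also works with utilities such as GridSearchCV and Pipeline.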
Documentation
usage: xgboost.train(params, dtrain, num_boost_round=10, evals=(), ...)
Common Parameters:
- booster: [gbtree, gblinear, dart]
- objective: [reg:squarederror, binary:logistic, multi:softmax, ...]
- eval_metric: [rmse, mae, logloss, error, merror, ...]
- tree_method: [auto, exact, approx, hist, gpu_hist]
- max_depth: Maximum depth of a tree
- eta: Step-size shrinkage (learning rate)
- subsample: Row sampling rate
- colsample_bytree: Column sampling rate per tree
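To illustrate how these parameters combine, a hypothetical parameter dictionary for a 3-class GPU job might look like this (all values are examples, not tuned recommendations):

params = {
    "booster": "gbtree",
    "objective": "multi:softmax",
    "num_class": 3,              # multi:softmax additionally requires num_class
    "eval_metric": "merror",
    "tree_method": "gpu_hist",   # GPU-accelerated histogram algorithm
    "max_depth": 6,
    "eta": 0.1,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
}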
CLI alternative:
$ xgboost train.conf
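A train.conf for the CLI is a plain key = value file. A hedged sketch, modeled on the configuration files shipped with XGBoost's CLI demos (the file names below are placeholders):

# illustrative train.conf; file names are placeholders
booster = gbtree
objective = binary:logistic
eta = 0.1
max_depth = 6
num_round = 50
# training and evaluation sets in LIBSVM format
data = "train.txt"
eval[test] = "test.txt"
model_out = "xgb.model"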
For Python help:
>>> import xgboost as xgb
>>> help(xgb)
Examples/Usage
Load the module:
$ module load xgboost-py37-cuda10.2-gcc8/1.5.0
Python usage:
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load a small binary-classification dataset and split it
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target)

# Wrap the NumPy arrays in XGBoost's DMatrix container
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

params = {
    "objective": "binary:logistic",  # binary classification, probability output
    "tree_method": "gpu_hist",       # GPU-accelerated histogram algorithm
    "eval_metric": "logloss"
}

# Train for 50 rounds, reporting logloss on the held-out set each round
bst = xgb.train(params, dtrain, num_boost_round=50, evals=[(dtest, "eval")])
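Continuing from the training above, the booster can score the held-out set and be saved in the JSON-based model format mentioned in the description (the filename is illustrative):

# Predicted probabilities, since the objective is binary:logistic
preds = bst.predict(dtest)
# Save in the JSON model format; illustrative filename
bst.save_model("model.json")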
Unload the module:
$ module unload xgboost-py37-cuda10.2-gcc8/1.5.0
Installation
Source code is obtained from the official XGBoost repository: https://github.com/dmlc/xgboost