tensorrt
- Version:
8.0.3.4, 7.0.0.11
- Category:
ai
- Cluster:
Loki
Description
NVIDIA TensorRT is a high-performance deep learning inference SDK for deploying AI models on NVIDIA GPUs. It supports model optimization, quantization, and deployment from popular frameworks such as TensorFlow, PyTorch, and ONNX.
TensorRT 8.0.3.4 features:
- Highly optimized INT8 and FP16 inference
- ONNX and native parser support
- Multi-stream execution
- Layer and kernel fusion
- CUDA 10.2 compatibility for legacy GPU environments
TensorRT accelerates models for image classification, segmentation, object detection, and language modeling.
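The INT8 mode mentioned above rests on mapping FP32 values onto 8-bit integers with a per-tensor scale. A minimal pure-Python sketch of symmetric INT8 quantization, illustrative only and not the TensorRT API:

```python
# Illustrative sketch of symmetric per-tensor INT8 quantization,
# the numeric idea behind TensorRT's INT8 inference mode.
# (Plain Python for clarity; TensorRT does this internally on the GPU.)

def quantize_int8(values):
    """Map FP32 values to INT8 codes using a symmetric per-tensor scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 values from the INT8 codes."""
    return [x * scale for x in q]

vals = [0.5, -1.25, 3.0, -0.75]
q, scale = quantize_int8(vals)
approx = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step (scale / 2).
assert all(abs(a - b) <= scale / 2 for a, b in zip(vals, approx))
```

In practice TensorRT derives these scales from a calibration dataset rather than from a single tensor's maximum.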
Documentation
TensorRT is typically used via its Python or C++ APIs.
Python Example:
----------------
>>> import tensorrt as trt
>>> logger = trt.Logger(trt.Logger.WARNING)
>>> builder = trt.Builder(logger)
>>> print(trt.__version__)
'8.0.3'
Command-line tools:
-------------------
trtexec --onnx=model.onnx --explicitBatch --saveEngine=model.engine
polygraphy run model.onnx --onnxrt --trt
Help:
$ trtexec --help
$ polygraphy --help
Examples/Usage
Load the module:
$ module load tensorrt-cuda10.2/8.0.3.4
Run trtexec on an ONNX model:
$ trtexec --onnx=model.onnx --saveEngine=model.engine --explicitBatch
Python API usage:
import tensorrt as trt
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
print("TensorRT version:", trt.__version__)
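The builder above can be extended to parse an ONNX model and produce a serialized engine. A sketch assuming the TensorRT 8.x Python API (guarded so it degrades gracefully where tensorrt is not importable; `model.onnx` is the same placeholder path used with trtexec above):

```python
# Sketch: build a serialized TensorRT engine from an ONNX file.
# Assumes the TensorRT 8.x Python API; guarded for machines without it.
try:
    import tensorrt as trt
    HAVE_TRT = True
except ImportError:
    HAVE_TRT = False

def build_engine(onnx_path):
    """Parse an ONNX model and return a serialized TensorRT engine,
    or None when the tensorrt module is unavailable."""
    if not HAVE_TRT:
        return None
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # Explicit-batch networks are required for the ONNX parser.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))
    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB scratch space
    return builder.build_serialized_network(network, config)

# Usage: serialized = build_engine("model.onnx")
```

The serialized bytes can be written to disk and later deserialized with a `trt.Runtime`, analogous to the `--saveEngine` output of trtexec.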
Inspect TensorRT tools:
$ which trtexec
$ trtexec --help
Unload the module:
$ module unload tensorrt-cuda10.2/8.0.3.4
Installation
Source code is obtained from the NVIDIA/TensorRT GitHub repository; binary releases are distributed via the NVIDIA TensorRT developer page.