cuda11.2
- Version: 11.2.0
- Category: tools
- Cluster: Loki
Description
The NVIDIA CUDA® Toolkit 11.2 provides a robust development environment for creating GPU-accelerated applications using C, C++, and Fortran.
New in CUDA 11.2:
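Device Link Time Optimization (LTO) in nvcc, which improves the runtime performance of separately compiled device code
Faster builds through parallel compilation with nvcc (--threads option)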
Expanded CUDA Graph APIs
Enhanced support for the Ampere architecture (e.g., A100 GPUs)
New library versions:
- cuBLAS 11.4
- cuFFT 10.4
- cuSPARSE 11.3
- NCCL 2.8+
CUDA 11.2 continues to support:
Volta, Turing, and Pascal GPUs
Compatibility with multiple GCC versions (up to GCC 9.x)
Available modules:
- cuda11.2/toolkit/11.2.0: core compiler (nvcc), CUDA runtime, Nsight tools, and development headers
- cuda11.2/blas/11.2.0: cuBLAS 11.4, optimized linear algebra routines for dense matrix operations
- cuda11.2/fft/11.2.0: cuFFT 10.4, GPU-accelerated Fast Fourier Transform routines
Documentation
Examples/Usage
Load the core CUDA module:
$ module load cuda11.2/toolkit/11.2.0
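To confirm that the toolkit module took effect, the release reported by nvcc can be compared against the expected version. The following is a minimal sketch: the helper name nvcc_release is hypothetical, and it assumes that nvcc --version prints a line containing "release 11.2", as CUDA 11.x toolkits typically do.

```shell
# Extract the "major.minor" release from `nvcc --version` output read on stdin.
# (nvcc_release is a hypothetical helper, not part of the CUDA Toolkit.)
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    release=$(nvcc --version | nvcc_release)
    if [ "$release" = "11.2" ]; then
        echo "nvcc $release is active"
    else
        echo "expected nvcc 11.2, found: ${release:-none}"
    fi
else
    echo "nvcc not found; load cuda11.2/toolkit/11.2.0 first"
fi
```

If a different release is reported, another CUDA module is probably loaded ahead of this one; module list shows what is currently active.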
Compile a program targeting Ampere:
$ nvcc -arch=sm_80 my_kernel.cu -o my_kernel
Run the program:
$ ./my_kernel
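The -arch value has to match the GPU generation of the node the program runs on. As a sketch, a small lookup function can keep the common values in one place. The function name arch_for is hypothetical, and the values listed are those of the data-center parts (P100, V100, T4, A100); other cards of the same generation may need a different value, so check the hardware actually installed on the cluster.

```shell
# Map a GPU generation name to an nvcc -arch value.
# (arch_for is a hypothetical helper; values are for P100, V100, T4, and A100.)
arch_for() {
    case "$1" in
        pascal) echo sm_60 ;;
        volta)  echo sm_70 ;;
        turing) echo sm_75 ;;
        ampere) echo sm_80 ;;
        *)      echo "unknown GPU generation: $1" >&2; return 1 ;;
    esac
}
```

For example, nvcc -arch="$(arch_for volta)" my_kernel.cu -o my_kernel would build the same source for a Volta node.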
Load additional libraries if needed:
$ module load cuda11.2/blas/11.2.0
$ module load cuda11.2/fft/11.2.0
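Loading the library modules makes cuBLAS and cuFFT available, but a program still has to be linked against them with -lcublas and -lcufft. The sketch below only prints the nvcc command instead of running it. The names link_cmd and my_app.cu are hypothetical, and it assumes that the modules add the library directories to the search paths nvcc uses.

```shell
# Print (not run) the nvcc command that links a source file against the
# given CUDA libraries. link_cmd is a hypothetical helper for illustration.
link_cmd() {
    src=$1
    shift
    out=${src%.cu}                      # my_app.cu -> my_app
    cmd="nvcc -arch=sm_80 $src -o $out"
    for lib in "$@"; do
        cmd="$cmd -l$lib"               # one -l flag per library
    done
    echo "$cmd"
}

link_cmd my_app.cu cublas cufft
# prints: nvcc -arch=sm_80 my_app.cu -o my_app -lcublas -lcufft
```

If the link step cannot find the libraries, the directory may need to be passed explicitly with -L.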
Check your environment configuration:
$ echo $CUDA_HOME
$ echo $PATH
$ echo $LD_LIBRARY_PATH
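Instead of reading the long PATH and LD_LIBRARY_PATH values by eye, the check can be scripted. This is a minimal sketch: path_contains is a hypothetical helper, and the bin and lib64 subdirectories are assumed from the usual Linux toolkit layout, which may differ on this cluster.

```shell
# Return success if directory $2 is an entry of the colon-separated list $1.
# (path_contains is a hypothetical helper for illustration.)
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# CUDA_HOME is assumed to be set by the toolkit module.
if [ -n "${CUDA_HOME:-}" ]; then
    path_contains "$PATH" "$CUDA_HOME/bin" ||
        echo "warning: $CUDA_HOME/bin is not in PATH"
    path_contains "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64" ||
        echo "warning: $CUDA_HOME/lib64 is not in LD_LIBRARY_PATH"
else
    echo "CUDA_HOME is not set; is the toolkit module loaded?"
fi
```

The helper matches whole entries only, so a parent directory such as $CUDA_HOME does not count as a match for $CUDA_HOME/bin.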
Unload the CUDA modules:
$ module unload cuda11.2/toolkit/11.2.0
$ module unload cuda11.2/blas/11.2.0
$ module unload cuda11.2/fft/11.2.0
Installation
The CUDA 11.2 Toolkit was installed from: https://developer.nvidia.com/cuda-11.2-download-archive