cuda12.0
- Version: 12.0.1
- Category: tools
- Cluster: Loki
Description
The NVIDIA CUDA® Toolkit 12.0 is the first release in the CUDA 12.x series, introducing next-generation compiler infrastructure and expanded hardware support for modern GPUs, including Hopper (H100).
Available modules:
cuda12.0/toolkit/12.0.1: Core development tools (nvcc, runtime libraries, headers, Nsight profilers)
cuda12.0/blas/12.0.1: cuBLAS 12.0, the updated dense linear algebra library
cuda12.0/fft/12.0.1: cuFFT 11.0, high-performance Fast Fourier Transform routines
Key features in CUDA 12.0:
Initial support for Hopper architecture (sm_90)
NVVM IR v2 and enhanced Link Time Optimization (LTO)
Improved support for Multi-Process Service (MPS)
Compatibility with newer Linux distros and GCC versions
Library updates: cuBLAS 12.0, cuFFT 11.0, cuSPARSE 12.0
Enhanced interoperability with OpenACC and OpenMP offloading
CUDA 12.0 maintains backward compatibility with previous GPU generations (Ampere, Turing, Volta, Pascal).
Documentation
Examples/Usage
Load the CUDA 12.0 toolkit:
$ module load cuda12.0/toolkit/12.0.1
Compile a program targeting Hopper:
$ nvcc -arch=sm_90 matrixMul.cu -o matrixMul
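The -arch value follows the GPU's compute capability: Hopper is compute capability 9.0, hence sm_90. If you are unsure which value applies on a given node, the sketch below derives it. The function name arch_from_cc is a hypothetical helper, not part of the CUDA toolkit, and the nvidia-smi query shown in the comment assumes a driver recent enough to report the compute_cap field.

```shell
# Hypothetical helper (not part of the CUDA toolkit): turn a compute
# capability such as "9.0" into the nvcc architecture name "sm_90".
arch_from_cc() {
    # Strip the dot from the capability: "9.0" -> "90", "8.6" -> "86".
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Possible use on a GPU node (assumes nvidia-smi supports compute_cap):
#   cc=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
#   nvcc -arch="$(arch_from_cc "$cc")" matrixMul.cu -o matrixMul
```

Run the query on a compute node that actually has a GPU; login nodes may not have one.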
Run your application:
$ ./matrixMul
Optionally load cuBLAS and cuFFT:
$ module load cuda12.0/blas/12.0.1
$ module load cuda12.0/fft/12.0.1
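With the library modules loaded, a program that calls cuBLAS or cuFFT still has to be linked against them. The sketch below shows one way the compile line might look. The file names app.cu and app are placeholders, and it assumes the modules make the libraries visible to the linker; if they do not, an explicit -L path would also be needed.

```shell
# Placeholder file names, for illustration only.
src="app.cu"
out="app"

# -lcublas and -lcufft ask the linker for the cuBLAS and cuFFT libraries.
cmd="nvcc -arch=sm_90 $src -o $out -lcublas -lcufft"
echo "$cmd"

# Compile only if nvcc is on PATH and the source file exists.
if command -v nvcc >/dev/null 2>&1 && [ -f "$src" ]; then
    $cmd
fi
```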
View CUDA environment variables:
$ echo $CUDA_HOME
$ echo $PATH
$ echo $LD_LIBRARY_PATH
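Reading long PATH values by eye is error-prone, so a scripted check can help. This is a minimal sketch, assuming the toolkit module sets CUDA_HOME and adds $CUDA_HOME/bin to PATH; path_contains is a hypothetical helper written for this example.

```shell
# Hypothetical helper: succeed if directory $2 is an entry in the
# colon-separated list $1 (the format used by PATH and LD_LIBRARY_PATH).
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the module sets CUDA_HOME and puts $CUDA_HOME/bin on PATH.
if [ -z "${CUDA_HOME:-}" ]; then
    echo "CUDA_HOME is not set; is the toolkit module loaded?"
elif path_contains "$PATH" "$CUDA_HOME/bin"; then
    echo "nvcc directory is on PATH: $CUDA_HOME/bin"
else
    echo "warning: $CUDA_HOME/bin is not on PATH"
fi
```

Running nvcc --version afterwards confirms which compiler release is actually picked up.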
Unload the modules:
$ module unload cuda12.0/toolkit/12.0.1
$ module unload cuda12.0/blas/12.0.1
$ module unload cuda12.0/fft/12.0.1
Installation
The CUDA 12.0.1 Toolkit was installed from: https://developer.nvidia.com/cuda-12-0-1-download-archive