cuda12.0

Version:: 12.0.1
Category:: tools
Cluster:: Loki

Author / Distributor

https://developer.nvidia.com/cuda-toolkit-archive

Description

The NVIDIA CUDA® Toolkit 12.0 is the first release in the CUDA 12.x series, introducing next-generation compiler infrastructure and expanded hardware support for modern GPUs, including Hopper (H100).

Available modules:

cuda12.0/toolkit/12.0.1: Core development tools (nvcc, runtime libraries, headers, Nsight profilers)
cuda12.0/blas/12.0.1: cuBLAS 12.0, updated dense linear algebra library
cuda12.0/fft/12.0.1: cuFFT 11.0, high-performance Fast Fourier Transform routines

Key features in CUDA 12.0:

Initial support for Hopper architecture (sm_90)
NVVM IR v2 and enhanced Link Time Optimization (LTO)
Improved support for Multi-Process Service (MPS)
Compatibility with newer Linux distributions and GCC versions
Library updates: cuBLAS 12.0, cuFFT 11.0, cuSPARSE 12.0
Enhanced interoperability with OpenACC and OpenMP offloading

CUDA 12.0 maintains backward compatibility with previous GPU generations (Ampere, Turing, Volta, Pascal).

Documentation

Toolkit: https://docs.nvidia.com/cuda/archive/12.0/
cuBLAS: https://docs.nvidia.com/cuda/archive/12.0/cublas
cuFFT: https://docs.nvidia.com/cuda/archive/12.0/cufft
Programming Guide: https://docs.nvidia.com/cuda/archive/12.0/cuda-c-programming-guide/

Examples/Usage

Load the CUDA 12.0 toolkit:

$ module load cuda12.0/toolkit/12.0.1

Compile a program targeting Hopper:

$ nvcc -arch=sm_90 matrixMul.cu -o matrixMul
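
A binary built with -arch=sm_90 alone targets Hopper only. To cover more than one GPU generation, nvcc accepts repeated -gencode options. The following is a minimal sketch, assuming a POSIX-compatible shell; the helper name gencode_flags is hypothetical and not part of the toolkit. It emits native code for each compute capability given and embeds PTX for the last one listed so that newer GPUs can JIT-compile it.

```shell
# gencode_flags: hypothetical helper, not part of CUDA.
# Prints nvcc -gencode options for each compute capability given
# (for example 80 for Ampere A100, 90 for Hopper H100).
gencode_flags() {
    flags=""
    last=""
    for cc in "$@"; do
        # Native machine code (SASS) for this architecture
        flags="$flags -gencode arch=compute_${cc},code=sm_${cc}"
        last="$cc"
    done
    if [ -n "$last" ]; then
        # PTX for the last architecture listed, for JIT on newer GPUs
        flags="$flags -gencode arch=compute_${last},code=compute_${last}"
    fi
    # Strip the leading space before printing
    printf '%s\n' "${flags# }"
}
```

With the function defined, a build for both Ampere and Hopper might look like:

$ nvcc $(gencode_flags 80 90) matrixMul.cu -o matrixMul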

Run your application:

$ ./matrixMul

Optionally load cuBLAS and cuFFT:

$ module load cuda12.0/blas/12.0.1
$ module load cuda12.0/fft/12.0.1

View CUDA environment variables:

$ echo $CUDA_HOME
$ echo $PATH
$ echo $LD_LIBRARY_PATH
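
Reading these variables by eye is error-prone when PATH is long. The following sketch checks them automatically. It assumes that the module sets CUDA_HOME and that the toolkit keeps its binaries in bin/ and its libraries in lib64/ under that directory; both are assumptions about this installation, and the function names are hypothetical.

```shell
# path_has_dir LIST DIR: hypothetical helper. Succeeds if DIR is an
# entry of the colon-separated LIST.
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_cuda_env: hypothetical helper. Assumes the module sets CUDA_HOME
# and that binaries live in bin/ and libraries in lib64/ below it.
check_cuda_env() {
    if [ -z "${CUDA_HOME:-}" ]; then
        echo "CUDA_HOME is not set"
        return 1
    fi
    status=0
    if ! path_has_dir "${PATH:-}" "$CUDA_HOME/bin"; then
        echo "not on PATH: $CUDA_HOME/bin"
        status=1
    fi
    if ! path_has_dir "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64"; then
        echo "not on LD_LIBRARY_PATH: $CUDA_HOME/lib64"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "CUDA environment looks consistent"
    fi
    return "$status"
}
```

After loading the toolkit module, run:

$ check_cuda_env

A non-zero exit status means at least one of the expected directories is missing, or the installation uses a different layout than assumed here.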

Unload the modules (libraries first, then the toolkit):

$ module unload cuda12.0/fft/12.0.1
$ module unload cuda12.0/blas/12.0.1
$ module unload cuda12.0/toolkit/12.0.1
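
To confirm which of these modules are currently loaded, the following sketch inspects LOADEDMODULES. It assumes the cluster's module system exports that colon-separated variable, as Environment Modules and Lmod normally do; the function names are hypothetical.

```shell
# module_loaded NAME: hypothetical helper. Succeeds if NAME is listed in
# LOADEDMODULES (assumed to be maintained by the module system).
module_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# report_cuda_modules: prints the state of each CUDA 12.0 module
# named on this page.
report_cuda_modules() {
    for m in cuda12.0/toolkit/12.0.1 cuda12.0/blas/12.0.1 cuda12.0/fft/12.0.1; do
        if module_loaded "$m"; then
            echo "$m: loaded"
        else
            echo "$m: not loaded"
        fi
    done
}
```

Running report_cuda_modules prints one line per module; the standard command "module list" gives the same information for all loaded modules.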

Installation

The CUDA 12.0.1 Toolkit was installed from: https://developer.nvidia.com/cuda-12-0-1-download-archive