cuda11.4

Version:

11.4.2

Category:

tools

Cluster:

Loki

Author / Distributor

https://developer.nvidia.com/cuda-toolkit-archive

Description

The NVIDIA CUDA® Toolkit 11.4 provides the compiler, libraries, and development tools for GPU-accelerated computing. It continues to support the Ampere, Turing, Volta, and Pascal architectures.

Available modules:

  • cuda11.4/toolkit/11.4.2 Core development toolkit including nvcc, the CUDA runtime libraries, Nsight tools, and headers

  • cuda11.4/blas/11.4.2 cuBLAS 11.5 — highly optimized GPU-accelerated linear algebra routines

  • cuda11.4/fft/11.4.2 cuFFT — Fast Fourier Transform library for GPU workloads
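
After loading the toolkit module, it can be worth confirming that the nvcc on your PATH really is the 11.4 release. The following is a minimal sketch, not a site-provided script: the helper name parse_release is hypothetical, and it assumes that nvcc --version prints a line containing "release X.Y," as recent toolkits do.

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` output on stdin and print
# the release number (e.g. 11.4). Prints nothing if no release line is found.
parse_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Only query nvcc when it is actually on PATH (i.e. a toolkit module is loaded).
if command -v nvcc >/dev/null 2>&1; then
    release=$(nvcc --version | parse_release)
    if [ "$release" = "11.4" ]; then
        echo "nvcc release $release matches this module"
    else
        echo "unexpected nvcc release: $release" >&2
    fi
fi
```

If the check reports a different release, another CUDA module is probably loaded ahead of this one.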

Highlights in CUDA 11.4:

  • Enhanced support for A100 GPUs and multi-instance GPU (MIG) features

  • Updates to cuBLAS 11.5, cuSPARSE, cuFFT, and other libraries

  • Improved debugging/profiling via Nsight Systems and Nsight Compute

  • Expanded CUDA Graph features

  • Faster device memory initialization via asynchronous copies (cudaMemcpyAsync)
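
As a rough illustration of the profiling highlight, Nsight Systems is usually driven from the command line with nsys profile. The sketch below assumes that the toolkit module puts nsys on your PATH, which this page does not state; the wrapper name run_profiled and the output name report are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: profile a command with Nsight Systems when the
# nsys CLI is on PATH; otherwise run the command unprofiled.
run_profiled() {
    if command -v nsys >/dev/null 2>&1; then
        # -o names the report file that nsys writes.
        nsys profile -o report "$@"
    else
        "$@"
    fi
}
```

For example, run_profiled ./app would profile a compiled binary named app. For per-kernel metrics, Nsight Compute offers a similar command-line tool, ncu.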

Documentation

Examples/Usage

  • Load the CUDA 11.4 toolkit module:

$ module load cuda11.4/toolkit/11.4.2

  • Compile a CUDA application targeting Ampere:

$ nvcc -arch=sm_80 app.cu -o app

  • Run the compiled program:

$ ./app

  • Load cuBLAS or cuFFT separately if needed:

$ module load cuda11.4/blas/11.4.2
$ module load cuda11.4/fft/11.4.2

  • Inspect your environment:

$ echo $CUDA_HOME
$ echo $PATH
$ echo $LD_LIBRARY_PATH

  • Unload modules:

$ module unload cuda11.4/toolkit/11.4.2
$ module unload cuda11.4/blas/11.4.2
$ module unload cuda11.4/fft/11.4.2
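
The compile step above targets a single architecture. If one binary should also run on the older GPUs named in the description, nvcc accepts one -gencode option per target. This is a minimal sketch, assuming a hypothetical helper gencode_flags and the same app.cu as above. Which GPU generations Loki's nodes actually have is not stated here, so the list 70 75 80 (Volta, Turing, Ampere) is only an example.

```shell
#!/bin/sh
# Hypothetical helper (not part of CUDA or the module): print one
# -gencode option per compute capability given as an argument.
gencode_flags() {
    flags=""
    for cc in "$@"; do
        flags="$flags -gencode arch=compute_${cc},code=sm_${cc}"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

# Example: build for Volta (70), Turing (75), and Ampere (80).
# The compiler only runs when nvcc and app.cu are both present.
if command -v nvcc >/dev/null 2>&1 && [ -f app.cu ]; then
    # Word splitting of the generated flags is intentional here.
    nvcc $(gencode_flags 70 75 80) app.cu -o app
fi
```

Each extra target increases compile time and binary size, so it is usually best to list only the architectures you expect to run on.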

Installation

The CUDA 11.4 Toolkit was installed from: https://developer.nvidia.com/cuda-11.4-download-archive