CUDA

Overview

CUDA is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can speed up computing applications dramatically by harnessing the power of GPUs.

In GPU-accelerated applications, the sequential part of the workload runs on the CPU – which is optimized for single-threaded performance – while the compute-intensive portion of the application runs on thousands of GPU cores in parallel. When using CUDA, developers program in popular languages such as C, C++, Fortran, Python and MATLAB and express parallelism through extensions in the form of a few basic keywords.

The CUDA Toolkit from NVIDIA provides everything you need to develop GPU-accelerated applications. The CUDA Toolkit includes GPU-accelerated libraries, a compiler, development tools and the CUDA runtime.

Using

Use the module name cuda to discover the available versions and to load the application.
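A minimal sketch of what that might look like in a shell session. The version 12.1.1 is taken from the nvcc path in the build log further down this page, and the module name format cuda/<version> is an assumption; use whatever the listing actually shows on your system. Some sites also require a compiler module to be loaded first.

```shell
# Assumed version, taken from the nvcc path in the build log on this page.
# Replace it with a version that "module avail cuda" actually lists.
CUDA_VERSION="12.1.1"

if command -v module >/dev/null 2>&1; then
    # List the CUDA modules that are visible in the current environment.
    module avail cuda
    # Load the chosen version (module name format is an assumption).
    module load "cuda/$CUDA_VERSION"
    # Confirm which compiler is now on the PATH.
    nvcc --version
else
    echo "module command not available in this shell"
fi
```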

CUDA Versions:

As new versions of CUDA are released and the compute capabilities of GPUs increase, older cards stop being supported. According to https://forums.developer.nvidia.com/t/nvcc-fatal-unsupported-gpu-architecture-compute-35/247815, CUDA 12.x has dropped support for Kepler compute 3.x devices. The minimum supported compute capability in CUDA 12 is 5.0.
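To see whether the GPUs on a node meet that minimum, something like the following sketch may help. It assumes a driver recent enough for nvidia-smi to accept the compute_cap query field; on older drivers the query can fail, in which case the card's capability has to be looked up by model name. The helper cc_supported and the variable MIN_SM are hypothetical names introduced here for illustration.

```shell
# Minimum compute capability for CUDA 12, written without the dot (5.0 -> 50).
MIN_SM=50

# Hypothetical helper: succeeds if a capability such as "8.6" meets the minimum.
cc_supported() {
    sm=$(printf '%s' "$1" | tr -d '.')   # "8.6" -> "86"
    [ "$sm" -ge "$MIN_SM" ]
}

# Query the GPUs on this node, if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r name cap; do
        cap=$(printf '%s' "$cap" | tr -d ' ')   # strip the space after the comma
        if cc_supported "$cap"; then
            echo "$name: compute $cap is supported by CUDA 12"
        else
            echo "$name: compute $cap is NOT supported by CUDA 12"
        fi
    done
fi
```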

For example, if you try to build some of the NVIDIA samples with make, you might see the following error:

[salexan5@blog1 deviceQuery]$ make
/apps/u/opt/gcc/12.2.0/cuda/12.1.1//bin/nvcc -ccbin g++ -I../../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o deviceQuery.o -c deviceQuery.cpp
nvcc fatal : Unsupported gpu architecture 'compute_35'
make: *** [Makefile:324: deviceQuery.o] Error 1

To resolve this, remove the compute 3.x GPU architecture entries (35 and 37) from the Makefile:

From: SMS ?= 35 37 50 52 60 61 70 75 80 86 90
To:   SMS ?= 50 52 60 61 70 75 80 86 90
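Because the Makefile assigns SMS with ?=, which only sets the variable when it is not already defined, it should also be possible to leave the file untouched and pass the list on the make command line. The sketch below filters the original list down to architectures at or above 50; the variable names SMS_ORIG and SMS_NEW are illustrative only.

```shell
# Architecture list as it appears in the sample Makefile.
SMS_ORIG="35 37 50 52 60 61 70 75 80 86 90"

# Keep only the architectures CUDA 12 still supports (compute 5.0 and later).
SMS_NEW=""
for sm in $SMS_ORIG; do
    if [ "$sm" -ge 50 ]; then
        SMS_NEW="${SMS_NEW:+$SMS_NEW }$sm"
    fi
done

echo "$SMS_NEW"

# Run from the sample directory to build with the filtered list:
# make SMS="$SMS_NEW"
```

A variable given on the make command line takes precedence over the assignment in the Makefile, so this avoids editing every sample individually.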