NVIDIA HPC SDK
Overview
The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries and tools for HPC, providing the proven compilers, libraries and software tools essential to maximizing developer productivity and the performance and portability of HPC applications.
Documentation: Please refer to the NVIDIA HPC SDK documentation for details on:
Compilers: nvc, nvc++, nvfortran and nvcc.
Programming models: C++ parallel algorithms, MPI, OpenACC, OpenMP and CUDA.
Math libraries: cuBLAS, cuTENSOR, cuSPARSE, cuSOLVER, cuFFT…
Tools: CUDA-GDB, Nsight
Using
Beartooth: Use the module name nvhpc to discover versions available and to load the application.
Teton: Use the module name hpc-sdk to discover versions available and to load the application.
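For example, on Beartooth (a minimal sketch assuming an Lmod-based module system; the 22.11 version matches the install referenced under Getting Started below):
[]$ module spider nvhpc        # list the versions available
[]$ module load nvhpc/22.11    # load the version you need
[]$ nvc --version              # confirm the compilers are on your PATH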
Compilers: Quick Overview
The SDK provides a number of compilers. Choose depending on your language and programming model:
Compiler | Description
---|---
nvc | C11 compiler for NVIDIA GPUs and AMD, Intel, OpenPOWER, and Arm CPUs. It invokes the C compiler, assembler, and linker for the target processors with options derived from its command line arguments.
nvc++ | C++17 compiler for NVIDIA GPUs and Intel CPUs. It invokes the C++ compiler, assembler, and linker for the target processors with options derived from its command line arguments.
nvfortran | Fortran compiler for NVIDIA GPUs and Intel CPUs. It invokes the Fortran compiler, assembler, and linker for the target processors with options derived from its command line arguments.
nvcc | CUDA C and CUDA C++ compiler driver for NVIDIA GPUs. nvcc accepts a range of conventional compiler options, such as for defining macros and include/library paths, and for steering the compilation process. nvcc produces optimized code for NVIDIA GPUs and drives a supported host compiler for AMD, Intel, OpenPOWER, and Arm CPUs.
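Typical invocations look like the following (a hedged sketch: the source file names are placeholders, and -acc, -stdpar=gpu and -mp=gpu are the SDK's flags for OpenACC, C++ parallel algorithms and OpenMP target offload respectively):
[]$ nvc -O2 -acc -o app_c app.c              # C with OpenACC offload
[]$ nvc++ -O2 -stdpar=gpu -o app_cpp app.cpp # C++17 parallel algorithms on the GPU
[]$ nvfortran -O2 -mp=gpu -o app_f app.f90   # Fortran with OpenMP target offload
[]$ nvcc -O2 -o app_cu app.cu                # CUDA C/C++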
Getting Started:
The installed SDKs come with a folder of examples that you can take a copy of. For example, on Beartooth:
[]$ cp -R /apps/u/opt/compilers/nvhpc/22.11/Linux_x86_64/22.11/examples .
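Each example directory typically provides its own Makefile. A minimal sketch (the OpenACC subdirectory name is an assumption; list the folder to see what your SDK version includes):
[]$ ls examples                # see which example sets are included
[]$ cd examples/OpenACC        # directory name is illustrative; pick one from the listing above
[]$ make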
Notes:
MPI: The SDK provides its own build of Open MPI (i.e. you do not have to separately load an openmpi module). Once built, run your code with srun to pick up your requested node/task/cpu configuration.
Building the examples: Running make on the various examples typically builds and runs the code. So, in some cases, unless you have created an interactive session with multiple cores (for MPI) or requested a GPU, the code will build but it will error when run. See the MPI sketch below for a suitable interactive session.
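A minimal sketch of building and running an MPI code in an interactive session (the hello_mpi source file is a placeholder, and you may need to add your account/partition to the allocation request; mpicc is the wrapper shipped with the SDK's Open MPI):
[]$ salloc --ntasks=4 --time=00:10:00   # interactive allocation with 4 tasks
[]$ module load nvhpc
[]$ mpicc -o hello_mpi hello_mpi.c      # hello_mpi.c is a placeholder MPI source file
[]$ srun ./hello_mpi                    # srun picks up the requested node/task/cpu configuration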