Overview
The Performance Application Programming Interface (PAPI) gives tool designers and application engineers a consistent interface and methodology for using the performance counter hardware found in most major microprocessors. PAPI lets software engineers see, in near real time, the relation between software performance and processor events.
In addition, PAPI provides access to a collection of components that expose performance measurement opportunities across the hardware and software stack.
PAPI Wiki: https://bitbucket.org/icl/papi/wiki/Home
PAPI MPI Documentation: https://bitbucket.org/icl/papi/wiki/PAPI-Parallel-Programs.md
PAPI is built with a number of components. If a component you need is not installed, please contact ARCC to request it.
Using
Use the module name papi to discover the versions available and to load the application.
To list the components installed with a given version of papi, load the module and run the papi_component_avail command.
Multicore
The papi library can be used to develop serial, multithreaded, and multinode applications. Please refer to the documentation for more information about using this functionality.
MPI Example
Below is an example of code using both papi and openmpi:
An example of compiling the code and running it is as follows:
[@blog2 test]$ module load gcc/12.2.0 papi/6.0.0.1 openmpi/4.1.4
[@blog2 test]$ salloc --account=<account> --time=1:00:00 --nodes=2 --ntasks-per-node=2
[@t514 test]$ mpicc -lpapi mpi_test.c -o mpi_test
[@t514 test]$ srun mpi_test
n = 50 pi is approximately 3.1416259869230037, Error is 0.0000333333332105
n = 100 pi is approximately 3.1416009869231249, Error is 0.0000083333333318
n = 150 pi is approximately 3.1415963572934968, Error is 0.0000037037037037