
Overview

Performance Application Programming Interface (PAPI) provides the tool designer and application engineer with a consistent interface and methodology for using the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events.

In addition, PAPI provides access to a collection of components that expose performance measurement opportunities across the hardware and software stack.

Using

Use the module name papi to discover the available versions and to load the software.
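
For example, on a system using environment modules, a typical session might look like the following (the version shown is illustrative and matches the example later on this page):

 Example of loading the papi module
[@blog2 ~]$ module avail papi
[@blog2 ~]$ module load papi/6.0.0.1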

To list the components installed with a given version of papi, use the papi_component_avail command. An example of calling the command follows:

 Example of using papi_component_avail command
[@blog2 ~]$ papi_component_avail
Available components and hardware information.
--------------------------------------------------------------------------------
PAPI version             : 6.0.0.1
Operating system         : Linux 4.18.0-372.32.1.el8_6.x86_64
Vendor string and code   : GenuineIntel (1, 0x1)
Model string and code    : Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz (106, 0x6a)
CPU revision             : 6.000000
CPUID                    : Family/Model/Stepping 6/106/6, 0x06/0x6a/0x06
CPU Max MHz              : 3500
CPU Min MHz              : 800
Total cores              : 96
SMT threads per core     : 2
Cores per socket         : 24
Sockets                  : 2
Cores per NUMA region    : 48
NUMA regions             : 2
Running in a VM          : no
Number Hardware Counters : 19
Max Multiplex Counters   : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------

Compiled-in components:
Name:   perf_event              Linux perf_event CPU counters
Name:   perf_event_uncore       Linux perf_event CPU uncore and northbridge
   \-> Disabled: No uncore PMUs or events found
Name:   example                 A simple example component
Name:   infiniband              Linux Infiniband statistics using the sysfs interface

Active components:
Name:   perf_event              Linux perf_event CPU counters
                                Native: 154, Preset: 0, Counters: 19
                                PMUs supported: ix86arch, perf, perf_raw, icl

Name:   example                 A simple example component
                                Native: 4, Preset: 0, Counters: 3

Name:   infiniband              Linux Infiniband statistics using the sysfs interface
                                Native: 96, Preset: 0, Counters: 96


--------------------------------------------------------------------------------

Multicore

The papi library can be used to develop serial, multithreaded, and multinode applications. Please refer to the PAPI documentation for more information about using this functionality.
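
As an illustration only (a minimal sketch, not taken from this page's original examples), the following shows one common way to count events per thread in a pthreads program: PAPI_thread_init tells the library how to obtain a unique id for each thread, and each thread then registers itself and manages its own event set.

 papi_threads.c
/*
 Minimal illustrative sketch: per-thread instruction counting with pthreads.
*/
#include <papi.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static void *worker(void *arg)
{
  int EventSet = PAPI_NULL;
  long long count = 0;

  /* Register this thread with PAPI before creating event sets */
  if (PAPI_register_thread() != PAPI_OK) exit(1);
  if (PAPI_create_eventset(&EventSet) != PAPI_OK) exit(1);
  if (PAPI_add_event(EventSet, PAPI_TOT_INS) != PAPI_OK) exit(1);
  if (PAPI_start(EventSet) != PAPI_OK) exit(1);

  /* ... the per-thread work to be measured goes here ... */

  if (PAPI_stop(EventSet, &count) != PAPI_OK) exit(1);
  printf("thread executed %lld instructions\n", count);
  PAPI_unregister_thread();
  return NULL;
}

int main(void)
{
  pthread_t tid;

  if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) exit(1);
  /* Tell PAPI how to identify the calling thread */
  if (PAPI_thread_init((unsigned long (*)(void)) pthread_self) != PAPI_OK)
    exit(1);

  pthread_create(&tid, NULL, worker, NULL);
  pthread_join(tid, NULL);
  return 0;
}

A program like this can be compiled with, for example, gcc papi_threads.c -o papi_threads -lpapi -lpthread.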

MPI Example

Below is an example of code using both papi and Open MPI:

 mpi_test.c
/*
 Code based on: https://bitbucket.org/icl/papi/wiki/PAPI-Parallel-Programs
*/
#include <papi.h>
#include "mpi.h"
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
void handle_error (int retval);
int main(int argc, char *argv[])
{
  int n, myid, numprocs, i, retval, EventSet = PAPI_NULL;
  double PI25DT = 3.141592653589793238462643;
  double mypi, pi, h, sum, x;
  long long values[1] = {0};
  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD,&myid);

  /* Initialize the PAPI library */
  retval = PAPI_library_init(PAPI_VER_CURRENT);
  if (retval != PAPI_VER_CURRENT) handle_error(retval);
  /* Create an EventSet */
  retval = PAPI_create_eventset(&EventSet);
  if (retval != PAPI_OK) handle_error(retval);
  /* Add Total Instructions Executed to our EventSet */
  retval = PAPI_add_event(EventSet, PAPI_TOT_INS);
  if (retval != PAPI_OK) handle_error(retval);
  /* Start counting */
  retval = PAPI_start(EventSet);
  if (retval != PAPI_OK) handle_error(retval);
  /* Approximate pi for several values of n while the counters run */
  for (n = 50; n <= 150; n += 50)
  {
    if (myid == 0) {
      printf("n = %i\n", n);
    }
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    h   = 1.0 / (double) n;
    sum = 0.0;
    for (i = myid + 1; i <= n; i += numprocs) {
      x = h * ((double) i - 0.5);
      sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (myid == 0) {
      printf("pi is approximately %.16f, Error is %.16f\n", pi, fabs(pi - PI25DT));
    }
  }
  /* Read the counters without stopping them */
  retval = PAPI_read(EventSet, values);
  if (retval != PAPI_OK) handle_error(retval);
  /* printf("After reading counters: %lld\n", values[0]); */
  /* Stop the counters */
  retval = PAPI_stop(EventSet, values);
  if (retval != PAPI_OK) handle_error(retval);
  /* printf("After stopping counters: %lld\n", values[0]); */
  MPI_Finalize();
  return 0;
}
void handle_error (int retval)
{
     printf("PAPI error %d: %s\n", retval, PAPI_strerror(retval));
     exit(1);
}

An example of compiling and running the code is:

[@blog2 test]$ module load gcc/12.2.0 papi/6.0.0.1 openmpi/4.1.4
[@blog2 test]$ salloc --account=<account> --time=1:00:00 --nodes=2 --ntasks-per-node=2
[@t514 test]$ mpicc mpi_test.c -o mpi_test -lpapi
[@t514 test]$ srun mpi_test
n = 50
pi is approximately 3.1416259869230037, Error is 0.0000333333332105
n = 100
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
n = 150
pi is approximately 3.1415963572934968, Error is 0.0000037037037037