IPM Homepage
 


Using IPM

Profiling your code with IPM

Note: If you are working on NSF TeraGrid machines, please consult this NSF quickstart guide in addition to the information below.

IPM can be used in one of two modes, either statically or dynamically:

Static Usage - in this case the user's code needs to be relinked (a Fortran variant is sketched after the dynamic usage example below):
mpicc my_mpi_code.c -L/path/to/ipm/lib -lipm
Dynamic Usage - no recompilation or relinking of the code is needed:

csh syntax
setenv LD_PRELOAD /path/to/ipm/lib/libipm.so
mpirun ./a.out
unsetenv LD_PRELOAD
bash syntax
LD_PRELOAD=/path/to/ipm/lib/libipm.so mpirun ./a.out
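
For a Fortran code, the static relink shown above is analogous. A minimal sketch, assuming your MPI installation provides the mpif90 compiler wrapper:

mpif90 my_mpi_code.f90 -L/path/to/ipm/lib -lipm
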
IPM is controlled via environment variables and through MPI_Pcontrol.

Environment Variables

Variable            Values            Description
IPM_REPORT          terse (default)   Aggregate wallclock time, memory usage, and flops are reported, along with the percentage of wallclock time spent in MPI calls.
                    full              Each HPM counter is reported, as are wallclock, user, system, and MPI time. The contribution of each MPI call to the communication time is given.
                    none              No report.
IPM_MPI_THRESHOLD   0.0 < x < 1.0     Only report MPI routines using more than x% of the total MPI time.
IPM_HPM             1,2,3,4,scan      POWER3 allows four different event sets. Use this environment variable to pick the event set, or select scan to use different event sets on different tasks. The scan option allows greater coverage of the HPM counters, but for codes with load imbalance or MPMD models uniform sampling may be more accurate; scan extrapolates to full totals based on the sampled event sets.
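
As a sketch of typical usage, these variables are simply set in the environment before the run (csh syntax; the threshold value is only illustrative):

setenv IPM_REPORT full            # request the full per-counter / per-MPI-call report
setenv IPM_MPI_THRESHOLD 0.1      # hide MPI routines below the threshold (see the table above)
mpirun ./a.out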

MPI_Pcontrol

The first argument to MPI_Pcontrol determines what action will be taken by IPM.

Arguments    Description
 1,"label"   start code region "label"
-1,"label"   exit code region "label"
 0,"label"   invoke custom event "label"

Code Regions

Defining code regions and events:

       C                                     FORTRAN
MPI_Pcontrol( 1,"proc_a");           call mpi_pcontrol( 1,"proc_a"//char(0))
MPI_Pcontrol(-1,"proc_a");           call mpi_pcontrol(-1,"proc_a"//char(0))
MPI_Pcontrol( 0,"tag_a");            call mpi_pcontrol(0,"tag_a"//char(0))

                                     ( Fortran label strings must be null-terminated )
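
As an illustration, a minimal C sketch using these calls (the region and event names here are arbitrary; IPM must be linked or preloaded as described above):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Pcontrol( 1, "solve");        /* enter code region "solve" */
    /* ... computation and MPI communication to be profiled ... */
    MPI_Pcontrol(-1, "solve");        /* exit code region "solve" */

    MPI_Pcontrol( 0, "checkpoint");   /* record custom event "checkpoint" */

    MPI_Finalize();
    return 0;
}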


Post-processing IPM output

By default IPM produces a summary of the performance information for the application on stdout. IPM also generates an XML file that can be used to produce a graphical webpage. This can be done in one of two ways:
  1. Generate the webpage on the cluster where IPM ran and then ftp the HTML to a local site.
    • Build ploticus for the cluster head node. It is available at http://ploticus.sourceforge.net
    • setenv IPM_KEYFILE /path/to/ipm/ipm_key
    • /path/to/ipm/bin/ipm_parse -html xmlfile
    • This will generate a directory named something like
       
       a.out_1_nwright.1231369287.321103.0_ipm_unknown
      
      Tar up that directory, ftp it to your laptop, untar it, and look at index.html.
  2. Move the IPM XML file to your laptop/desktop and generate the HTML there. The XML file will be named something like
    your_username.1231369287.321103.0, e.g. nwright.1231369287.321103.0
    
    • Install ploticus on your local machine (http://ploticus.sourceforge.net) - note that you can do this under Cygwin on Windows.
    • Put a copy of IPM on your local machine - there is no need to compile it; you just need access to the ipm_parse script and the keyfile.
    • setenv IPM_KEYFILE /path/to/ipm/ipm_key
    • /path/to/ipm/bin/ipm_parse -html
    • You will be left with a directory containing an index.html file that you can open in your favorite browser.
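
For example, the first workflow reduces to a short command sequence (csh syntax; the file and directory names follow the examples above and will differ for your run, and the archive name ipm_html.tar is arbitrary):

setenv IPM_KEYFILE /path/to/ipm/ipm_key
/path/to/ipm/bin/ipm_parse -html nwright.1231369287.321103.0
tar cf ipm_html.tar a.out_1_nwright.1231369287.321103.0_ipm_unknown
# copy ipm_html.tar to your local machine, untar it, and open index.html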

Using Hardware Performance Counters with IPM

IPM provides a method of collecting data from hardware performance counters, using either the PAPI or (on AIX systems) the PMAPI interface. Within IPM several default counter groups are defined for each type of processor. These are listed below and are accessed by setting the IPM_HPM environment variable to the number corresponding to the desired group. In addition, the user can also choose their own set of counters by setting IPM_HPM to a comma-separated list of the desired measurements.
setenv IPM_HPM PAPI_FP_OPS,PAPI_TOT_INS,PAPI_L1_DCM,PAPI_L1_DCA
In this case the user is responsible for choosing a valid combination of counters (as reported by papi_avail, for example).
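
For instance, a sketch of selecting a custom counter set, assuming the PAPI command-line tools are in your PATH (counter availability varies by processor):

papi_avail                        # lists the PAPI preset events and whether they are available
setenv IPM_HPM PAPI_FP_OPS,PAPI_TOT_CYC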

The detailed hardware performance counter settings for various platforms can be found in this file.
