Run perf with an MPI application

perf is a performance analysis tool that reports hardware and software events. I am trying to run it with an MPI application to learn how much time the application spends on each core on data transfers and on compute operations.

Normally, I would run my application with

mpirun -np $NUMBER_OF_CORES app_name

It then spawns processes on several cores, or possibly on several nodes. Is it possible to add perf on top? I've tried

perf stat mpirun -np $NUMBER_OF_CORES app_name

But the output looks like some sort of aggregate for mpirun as a whole. Is there a way to collect perf data for each core?

Something like:

mpirun -np $NUMBER_OF_CORES ./myscript.sh

might work with myscript.sh containing:

#!/bin/bash
# "$@" forwards all arguments given to the script (%* is Windows batch syntax)
perf stat app_name "$@"

You should also pass an output option to the perf call (for example -o with a file name that includes the rank), so that each process writes a differently named result file.
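A minimal sketch of such a wrapper follows. The script name perf_wrap.sh and the file naming scheme are made up for illustration, and the rank variables are assumptions about the launcher: OMPI_COMM_WORLD_RANK is exported by Open MPI, PMI_RANK by MPICH-style launchers, and SLURM_PROCID by srun. Check which one your MPI installation actually sets.

```shell
#!/bin/bash
# perf_wrap.sh (hypothetical name): run one MPI rank under `perf stat`
# and write its report to a per-rank file.

# Print this process's rank as exported by the launcher, or "unknown".
# Which variable is set depends on the MPI implementation in use.
rank_id() {
    echo "${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-${SLURM_PROCID:-unknown}}}"
}

# Build the report name; the host name keeps ranks on different nodes apart.
stat_file() {
    echo "perf_stat.$(uname -n).rank$(rank_id).txt"
}

# Only launch perf when a command was passed, so the helper functions
# can also be sourced on their own.
if [ "$#" -gt 0 ]; then
    exec perf stat -o "$(stat_file)" "$@"
fi
```

It would be launched as mpirun -np $NUMBER_OF_CORES ./perf_wrap.sh app_name, with any application arguments appended after app_name. Each rank then leaves its own report in the working directory.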

perf can follow spawned child processes. To profile the MPI processes located on the same node, you can simply do

perf stat mpiexec -n 2 ./my-mpi-app

You can use perf record as well. It will create a single perf.data file containing the profiling information for all the local MPI processes. However, this won't allow you to profile individual MPI ranks.

To get information about individual MPI ranks, you need to run

mpiexec -n 2 perf stat ./my-mpi-app

This will profile the individual ranks and will also work across multiple nodes. However, this does not work with some perf commands such as perf record.
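A likely reason for the perf record limitation is that every perf record process writes to perf.data in the current directory by default, so ranks sharing a directory overwrite each other's file. A commonly suggested workaround is a small wrapper that gives each rank its own output file through -o. The sketch below is an illustration, not a tested recipe for every MPI: the script name perf_record_rank.sh is hypothetical, and OMPI_COMM_WORLD_RANK (Open MPI) and PMI_RANK (MPICH-style launchers) are assumptions about your launcher.

```shell
#!/bin/bash
# perf_record_rank.sh (hypothetical name): record one profile per MPI rank.

# Build a per-rank output name. If no rank variable is set, fall back to
# the shell's PID so that concurrent processes still get distinct files.
record_file() {
    echo "perf.data.$(uname -n).${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-pid$$}}"
}

# Only launch perf when a command was passed, so the helper function
# can also be sourced on its own.
if [ "$#" -gt 0 ]; then
    exec perf record -o "$(record_file)" "$@"
fi
```

It would be launched as mpiexec -n 2 ./perf_record_rank.sh ./my-mpi-app. A single rank's profile can then be inspected with perf report -i followed by that rank's file name.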
