perf is a performance analysis tool that can report hardware and software events. I am trying to run it with an MPI application in order to learn how much time the application spends on each core on data transfers and compute operations.
Normally, I would run my application with
mpirun -np $NUMBER_OF_CORES app_name
And it would spawn on several cores or possibly several nodes. Is it possible to add perf on top? I've tried
perf stat mpirun -np $NUMBER_OF_CORES app_name
But the output for this looks like some sort of aggregate of mpirun. Is there a way to collect perf type data from each core?
Something like:
mpirun -np $NUMBER_OF_CORES ./myscript.sh
might work with myscript.sh containing:
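#!/bin/bash
# Forward all script arguments to the application ("$@" is the bash form; %* is Windows batch syntax).
perf stat app_name "$@"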
You should add an option to the perf call, such as -o with a file name that differs per process, so that each process writes its results to its own file.
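As a rough sketch of that idea, the wrapper below derives the output file name from the MPI rank. It assumes the launcher exports OMPI_COMM_WORLD_RANK (Open MPI) or PMI_RANK (MPICH-style launchers); other MPI implementations may use different variables, in which case it falls back to the process ID. The perf_stat.rankN.txt naming is made up for illustration, and unlike the script above it takes the command to run as its arguments.

```shell
#!/bin/bash
# myscript.sh (sketch): run one MPI rank under perf stat with a per-rank report file.

# Pick a rank identifier. Assumption: OMPI_COMM_WORLD_RANK is exported by
# Open MPI and PMI_RANK by MPICH-style launchers. If neither is set, fall
# back to this shell's PID so the file names still differ.
rank_id() {
    echo "${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-$$}}"
}

# Only launch perf when a command was passed on the command line.
if [ "$#" -gt 0 ]; then
    # -o makes perf stat write its counters to a file instead of stderr.
    exec perf stat -o "perf_stat.rank$(rank_id).txt" "$@"
fi
```

It would be launched as: mpirun -np $NUMBER_OF_CORES ./myscript.sh app_name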
perf can follow spawned child processes. To profile the MPI processes located on the same node, you can simply do
perf stat mpiexec -n 2 ./my-mpi-app
You can use perf record as well. It will create a single perf.data file containing the profiling information for all the local MPI processes. However, this won't allow you to profile individual MPI ranks.
To get information about individual MPI ranks, you need to run
mpiexec -n 2 perf stat ./my-mpi-app
This will profile the individual ranks and will also work across multiple nodes. However, this does not work with some perf commands such as perf record.
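One possible workaround for perf record, assuming the problem is that every rank on a node writes to the same default perf.data file in the working directory, is a small wrapper that gives each rank its own output file through -o. The script name record_rank.sh and the perf.data.rankN naming are hypothetical, and the rank variables are an assumption: OMPI_COMM_WORLD_RANK for Open MPI, PMI_RANK for MPICH-style launchers.

```shell
#!/bin/bash
# record_rank.sh (hypothetical name): one perf.data file per MPI rank.

# Build the output file name from the launcher's rank variable.
# Assumption: Open MPI exports OMPI_COMM_WORLD_RANK and MPICH-style
# launchers export PMI_RANK; otherwise the shell's PID is used.
perf_data_file() {
    echo "perf.data.rank${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-$$}}"
}

# Only launch perf when a command was passed on the command line.
if [ "$#" -gt 0 ]; then
    # -o selects the output file, so ranks no longer share perf.data.
    exec perf record -o "$(perf_data_file)" "$@"
fi
```

It would be launched as: mpiexec -n 2 ./record_rank.sh ./my-mpi-app
A single rank can then be inspected with: perf report -i perf.data.rank0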