
perf: strange relation between software events

Okay, so this really bugs me.

I'm using perf to record the cpu-clock event (a software event):

$ > perf record -e cpu-clock srun -n 1 ./stream

... and the table produced by perf report is empty.

When I instead record all available software events listed in perf list:

$ > perf record -e alignment-faults,context-switches,cpu-clock,cpu-migrations,\
dummy,emulation-faults,major-faults,minor-faults,page-faults,task-clock \
srun -n 1 ./stream

... the table gives me a list of available samples:

  0 alignment-faults
125 context-switches
255 cpu-clock
 21 cpu-migrations
  0 dummy
  0 emulation-faults
  0 major-faults
128 minor-faults
132 page-faults
254 task-clock

I can look at the samples collected for cpu-clock and they give me information. Why?! Why does it not work if I measure only cpu-clock? And why were no samples collected for four of the events?

This is a follow-up to this question: error: perf.data file has no samples

Probably srun doesn't start the target process with a direct fork. It may use some variant of a remote shell, such as ssh, or a daemon to start processes.

perf record (without the -a option) tracks only processes forked directly from the command it launches, not processes started (forked) by sshd or another daemon. It will also never profile a remote machine: if srun dispatches the job to another node, perf record ... srun profiles only the srun application itself and everything it forks locally.
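One way to test this hypothesis, sketched under the assumption of a Linux node with /proc mounted, is to check whether the workload's process has the perf-launched srun among its ancestors. The helper names ppid_of and is_descendant are made up for this illustration; they are not part of perf or Slurm.

```shell
#!/bin/sh
# Hypothetical helpers (not part of perf or Slurm); assumes Linux /proc.

# ppid_of PID: print the parent PID recorded in /proc/PID/status.
ppid_of() {
    while read -r key value; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done 2>/dev/null < "/proc/$1/status"
    return 1
}

# is_descendant PID ANCESTOR: succeed if ANCESTOR is in PID's parent chain.
is_descendant() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
        # Step one level up; give up if the process has disappeared.
        pid=$(ppid_of "$pid") || return 1
        [ "$pid" = "$2" ] && return 0
    done
    return 1
}
```

With the PID of ./stream and the PID of the srun started by perf record in hand (both looked up on the same node, for example with pgrep), a failing is_descendant "$stream_pid" "$srun_pid" would be consistent with the explanation above. If the two processes run on different machines, the local perf cannot see the workload at all.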

Try perf stat first to get the total (raw) performance counters, and pass perf as the argument to srun rather than the other way around; this is the correct usage with tools that use a remote shell or daemons (you may need the full path to perf):

 srun -n 1 perf stat ./stream
 srun -n 1 /usr/bin/perf stat ./stream

perf stat will print the running time of the target task. Then select an event with a high raw count (perf record usually tunes the sample rate to a few kHz, so thousands of samples will be generated if there are enough raw event counts):

 srun -n 1 perf record -e cpu-clock ./stream
 srun -n 1 /usr/bin/perf record -e cpu-clock ./stream
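To make the sample-rate remark concrete, here is a back-of-the-envelope sketch. It assumes a time-based event such as cpu-clock, which only accumulates samples while the task is on a CPU, and the runtime and frequencies below are made-up numbers, not measurements from the question. The function expected_samples is a hypothetical helper, not a perf command; the real sampling frequency is set with perf record -F.

```shell
#!/bin/sh
# expected_samples RUNTIME_MS FREQ_HZ: rough estimate of the sample count
# for a task that is on-CPU for RUNTIME_MS at sampling frequency FREQ_HZ.
# (Hypothetical helper for illustration only.)
expected_samples() {
    echo $(( $1 * $2 / 1000 ))
}

expected_samples 2500 1000   # 2.5 s at 1 kHz -> 2500
expected_samples 2500 4000   # 2.5 s at 4 kHz -> 10000
```

If perf stat reports a runtime of a few seconds but perf report shows far fewer samples than such an estimate, that suggests perf is not attached to the process doing the work.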

