Run `perf stat` on the output of `perf record`?

With perf (the Linux profiler), v4.15.18, I can run perf stat $COMMAND to get some simple stats on the command. If I run perf record, it saves lots of data to a perf.data file.

Can I run perf stat on the output of perf record, so that I can look at the recorded data but also get a simple overview?

perf stat uses the hardware performance monitoring unit in counting mode, while perf record / perf report with a perf.data file use the same unit in overflow (sampling) mode. In both modes a control register configures each hardware performance counter to count some kind of performance event (for example CPU cycles or instructions executed), and the counter is incremented on every such event.

In counting mode, perf stat sets the counters to zero at program start and reads the final counter values at program exit. (In practice the counting may be split into several segments, but the result is the same: a single value for the full run.)

In profiling mode (sampling profiling), perf record sets the counter to some negative value, for example -100000, and installs an overflow handler (the actual value is autotuned to match a target sampling frequency). Every 100000 events the counter overflows to zero and generates an interrupt. The perf_events interrupt handler records a "sample" (current time, pid, instruction pointer, and optionally the call stack with -g) into a ring buffer, which is saved to perf.data. The handler also resets the counter to -100000. After a long enough run there will be thousands of samples stored in perf.data, which can be used to generate a statistical profile of the program (which parts of the program ran most often).
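The difference between the two modes can be illustrated with a toy simulation. This is only a sketch of the idea in Python, not how the kernel implements it; the function names and the fixed period of 100000 are assumptions chosen to match the example above (real perf autotunes the period).

```python
def count_events(total_events):
    """Counting mode: start at zero and read the final value once."""
    counter = 0
    for _ in range(total_events):
        counter += 1
    return counter


def sample_events(total_events, period):
    """Sampling mode: arm the counter at -period, sample on each overflow to zero."""
    counter = -period
    samples = []
    for event_index in range(total_events):
        counter += 1
        if counter == 0:
            # Real perf would record time, pid and instruction pointer here.
            samples.append(event_index)
            # The overflow handler re-arms the counter.
            counter = -period
    return samples


total = 1_234_567   # hypothetical number of events in one run
period = 100_000

samples = sample_events(total, period)
print("exact count (counting mode):", count_events(total))
print("samples taken:", len(samples))
print("estimate from samples:", len(samples) * period)
```

The estimate is always a multiple of the period, so it can undercount by up to one period per event type. This is one reason the event counts recovered from perf.data are only approximate.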

What does perf stat show? In default mode on an x86_64 CPU: the running time of the program (task-clock and elapsed), 3 software events (context switches, CPU migrations, page faults), and 4 hardware counters (cycles, instructions, branches, branch-misses):

$ echo '3^123456%3' | perf stat bc
0
 Performance counter stats for 'bc':
        325.604672      task-clock (msec)         #    0.998 CPUs utilized          
                 0      context-switches          #    0.000 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               181      page-faults               #    0.556 K/sec                  
       828,234,675      cycles                    #    2.544 GHz                    
     1,840,146,399      instructions              #    2.22  insn per cycle         
       348,965,282      branches                  # 1071.745 M/sec                  
        15,385,371      branch-misses             #    4.41% of all branches        
       0.326152702 seconds time elapsed
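The values after the # sign in that output are ratios derived from the raw counts. As a rough check, this Python sketch recomputes them from the numbers printed above. The function name and dictionary keys are made up for illustration and are not part of perf.

```python
def derived_metrics(task_clock_msec, cycles, instructions,
                    branches, branch_misses, page_faults):
    """Recompute the ratios that perf stat prints after the '#' sign."""
    seconds = task_clock_msec / 1000.0
    return {
        "ghz": cycles / seconds / 1e9,
        "insn_per_cycle": instructions / cycles,
        "branches_m_per_sec": branches / seconds / 1e6,
        "branch_miss_pct": 100.0 * branch_misses / branches,
        "page_faults_k_per_sec": page_faults / seconds / 1e3,
    }


# Raw counts copied from the perf stat run shown above.
metrics = derived_metrics(
    task_clock_msec=325.604672,
    cycles=828_234_675,
    instructions=1_840_146_399,
    branches=348_965_282,
    branch_misses=15_385_371,
    page_faults=181,
)
for name, value in metrics.items():
    print(f"{name:>22}: {value:.3f}")
```

All of these ratios divide by task-clock or by another counter, so reproducing them from perf.data requires estimates of several events plus the running time.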

What does perf record record? With a single wakeup (ring buffer overflow) it saved 1293 samples into perf.data, using the default hardware event (cycles):

$ echo '3^123456%3' | perf record bc
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.049 MB perf.data (1293 samples) ]

With perf report --header | less, perf script and perf script -D you can look into the perf.data content:

$ perf report --header |grep event
# event : name = cycles:uppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD ...
# Samples: 1K of event 'cycles:uppp'
$ perf script 2>/dev/null |grep cycles|wc -l 
1293

There are some timestamps inside perf.data and some additional events for program start and exit (perf script -D | egrep exec\|EXIT), but there is not enough information in a default perf.data to fully reconstruct the perf stat output. Running time is recorded only as timestamps of start and exit and of every event sample; software events are not recorded; and only a single hardware event was used (cycles; no instructions, branches, or branch-misses). The count for the hardware event that was used can be approximated, but it is not exact (the real cycle count was around 820-825 million):

$ perf report --header |grep Event
# Event count (approx.): 836622729

With non-default recording of perf.data, more events can be estimated:

$ echo '3^123456%3' | perf record -e cycles,instructions,branches,branch-misses bc
[ perf record: Captured and wrote 0.238 MB perf.data (5164 samples) ]
$ perf report --header |egrep Event\|Samples
# Samples: 1K of event 'cycles'
# Event count (approx.): 834809036
# Samples: 1K of event 'instructions'
# Event count (approx.): 1834083643
# Samples: 1K of event 'branches'
# Event count (approx.): 347750459
# Samples: 1K of event 'branch-misses'
# Event count (approx.): 15382047
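To turn those header lines into something closer to a perf stat summary, one option is a small parser. The Python sketch below assumes the header keeps the "# Samples: ... of event '...'" / "# Event count (approx.): ..." layout shown above, which may differ between perf versions. The function name is hypothetical and the input is the output above pasted into a string.

```python
import re


def parse_report_header(text):
    """Map event name -> approximate event count from `perf report --header` text."""
    counts = {}
    current_event = None
    for line in text.splitlines():
        m = re.match(r"#\s*Samples:\s*\S+\s+of event '([^']+)'", line)
        if m:
            current_event = m.group(1)
            continue
        m = re.match(r"#\s*Event count \(approx\.\):\s*(\d+)", line)
        if m and current_event is not None:
            counts[current_event] = int(m.group(1))
            current_event = None
    return counts


# Header lines copied from the perf report output above.
REPORT = """\
# Samples: 1K of event 'cycles'
# Event count (approx.): 834809036
# Samples: 1K of event 'instructions'
# Event count (approx.): 1834083643
# Samples: 1K of event 'branches'
# Event count (approx.): 347750459
# Samples: 1K of event 'branch-misses'
# Event count (approx.): 15382047
"""

counts = parse_report_header(REPORT)
ipc = counts["instructions"] / counts["cycles"]
miss_pct = 100.0 * counts["branch-misses"] / counts["branches"]
print(f"insn per cycle (estimated): {ipc:.2f}")
print(f"branch-miss rate (estimated): {miss_pct:.2f}%")
```

The estimated ratios land close to the 2.22 insn per cycle and 4.41% reported by perf stat, but they come from sampled data and should be read as approximations.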

So, you can't run perf stat on a perf.data file, but you can ask perf report to print the header with event count estimates. You can also try to parse timestamps from perf script / perf script -D.
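A rough elapsed-time estimate from timestamps could look like the Python sketch below. It assumes each input line contains a timestamp in seconds followed by a colon; the exact line layout depends on the perf version and the fields selected, so the regular expression may need adjusting. The sample lines are made up for illustration and are not real perf output.

```python
import re

# Assumed shape: "<seconds>.<fraction>:" somewhere in each line.
TIME_RE = re.compile(r"(\d+\.\d+):")


def elapsed_seconds(lines):
    """Return last timestamp minus first timestamp, or None if none are found."""
    times = []
    for line in lines:
        m = TIME_RE.search(line)
        if m:
            times.append(float(m.group(1)))
    if not times:
        return None
    return max(times) - min(times)


# Made-up timestamps, for illustration only.
SAMPLE_LINES = [
    "4321.000100:",
    "4321.100350:",
    "4321.326200:",
]
print(elapsed_seconds(SAMPLE_LINES))
```

This measures only the span between the first and last recorded samples, so it slightly underestimates the elapsed time that perf stat reports.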

No, you can't. The output of perf record is a data file, while perf stat expects an application to run. You can use perf script to run pre-canned scripts that aggregate and summarize the trace data. The available scripts can be listed with the following command:
perf script -l
Besides the limited number of pre-canned scripts, you can also define custom perf.data processing scripts in Python or Perl.
See the documentation for perf script, perf script in Python, and perf script in Perl (the perf-script, perf-script-python and perf-script-perl man pages) for details.
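As an illustration of such a custom script, here is a minimal Python sketch that sums sample periods per event, which gives a perf stat-like estimate. It assumes perf was built with Python scripting support and that the generic process_event handler receives a dictionary with "ev_name" and "sample" keys (with "period" inside "sample"), as in the perf versions I have seen. Check perf-script-python for your version. The file name summary.py is hypothetical.

```python
# summary.py: run as `perf script -s summary.py` next to a perf.data file.
from collections import defaultdict

sample_counts = defaultdict(int)
period_sums = defaultdict(int)


def process_event(param_dict):
    """Called by perf once per sample; accumulate per-event totals."""
    name = param_dict["ev_name"]
    sample_counts[name] += 1
    # "period" is how many events this one sample stands for (assumed key).
    period_sums[name] += param_dict["sample"].get("period", 0)


def trace_end():
    """Called by perf after the last sample; print the summary."""
    for name in sorted(sample_counts):
        print("%s: %d samples, approx. %d events"
              % (name, sample_counts[name], period_sums[name]))
```

The script avoids f-strings on purpose, because older perf builds embed Python 2. As with perf report, the totals are estimates from sampled data, not exact counts.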
