Run `perf stat` on the output of `perf record`?
With perf (the Linux profiler, v4.15.18), I can run `perf stat $COMMAND` to get some simple stats on the command. If I run `perf record`, it saves lots of data to a `perf.data` file.
Can I run `perf stat` on the output of `perf record`, so that I can look at the recorded perf data but also get a simple overview?
`perf stat` uses the hardware performance monitoring unit in counting mode, while `perf record` / `perf report` with a `perf.data` file use the same unit in overflow mode. In both modes, hardware performance counters are configured through a control register to count some kind of performance event (for example CPU cycles or instructions executed), and the counter is incremented on every such event.
In counting mode, `perf stat` sets the counters to zero at program start and reads the final counter value at program exit (in practice the counting may be split into several segments, with the same result: a single value for the full run).
In profiling mode (sampling profiling), `perf record` sets the counter to some negative value, for example -100000, and installs an overflow handler (the actual value is autotuned to reach some sampling frequency). Every 100000 events the counter overflows to zero and generates an interrupt. The perf_events interrupt handler records a "sample" (current time, pid, instruction pointer, and optionally the call stack with `-g`) into a ring buffer, which is saved to `perf.data`. The handler also resets the counter to -100000 again. So, after a long enough run there will be thousands of samples stored in `perf.data`, which can be used to generate a statistical profile of the program (which parts of the program ran most often).
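The difference between the two modes can be illustrated with a toy simulation. This is only a sketch with a fixed period and made-up function names (`count_events`, `sample_events`); real perf autotunes the period and records much richer samples.

```python
def count_events(total_events):
    """Counting mode (perf stat style): start at zero, read once at the end."""
    counter = 0
    for _ in range(total_events):
        counter += 1
    return counter


def sample_events(total_events, period):
    """Overflow mode (perf record style), simplified to a fixed period.

    The counter starts at -period. When it reaches zero, a sample is
    recorded and the counter is re-armed to -period.
    """
    counter = -period
    samples = []
    for event_index in range(total_events):
        counter += 1
        if counter == 0:
            # Stand-in for a real sample (time, pid, instruction pointer).
            samples.append(event_index)
            counter = -period
    return samples


total = 1_234_567   # hypothetical number of events in the run
period = 100_000    # fixed sampling period, as in the example above

exact = count_events(total)
samples = sample_events(total, period)
estimate = len(samples) * period  # what can be recovered from samples alone

print("counting mode:", exact)
print("overflow mode:", len(samples), "samples, estimate", estimate)
```

In this toy model the estimate can only be a multiple of the period, so events after the last overflow are lost. That is one reason why counts reconstructed from samples are approximate.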
What does `perf stat` show? In default mode on an x86_64 CPU: the running time of the program (task-clock and elapsed), 3 software events (context switches, CPU migrations, page faults), and 4 hardware counters (cycles, instructions, branches, branch-misses):
$ echo '3^123456%3' | perf stat bc
0

 Performance counter stats for 'bc':

        325.604672      task-clock (msec)         #    0.998 CPUs utilized
                 0      context-switches          #    0.000 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               181      page-faults               #    0.556 K/sec
       828,234,675      cycles                    #    2.544 GHz
     1,840,146,399      instructions              #    2.22  insn per cycle
       348,965,282      branches                  # 1071.745 M/sec
        15,385,371      branch-misses             #    4.41% of all branches

       0.326152702 seconds time elapsed
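The derived values in the right-hand column follow from the raw counts. As a rough sketch (the variable names are made up, and the formulas are inferred from the numbers above, not taken from perf's source), they can be recomputed like this:

```python
# Raw values copied from the perf stat output above.
task_clock_msec = 325.604672
cycles = 828_234_675
instructions = 1_840_146_399
branches = 348_965_282
branch_misses = 15_385_371

task_clock_sec = task_clock_msec / 1000.0

# Cycles per second of CPU time, expressed in GHz.
ghz = cycles / task_clock_sec / 1e9
# Instructions retired per cycle.
ipc = instructions / cycles
# Share of branches that were mispredicted, in percent.
branch_miss_pct = 100.0 * branch_misses / branches

print(f"{ghz:.3f} GHz")
print(f"{ipc:.2f} insn per cycle")
print(f"{branch_miss_pct:.2f}% of all branches")
```

The same ratios can be applied to estimated counts taken from `perf.data`, but only if the corresponding events were recorded.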
What does `perf record` record? In a single wake-up event (ring buffer overflow) it saved 1293 samples into perf.data, using the default hardware event (cycles):
$ echo '3^123456%3' | perf record bc
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.049 MB perf.data (1293 samples) ]
With `perf report --header | less`, `perf script` and `perf script -D` you can take a look at the contents of perf.data:
$ perf report --header |grep event
# event : name = cycles:uppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD ...
# Samples: 1K of event 'cycles:uppp'
$ perf script 2>/dev/null |grep cycles|wc -l
1293
There are some timestamps inside perf.data and some additional events for program start and exit (`perf script -D |egrep exec\|EXIT`), but there is not enough information in a default `perf.data` to fully reconstruct `perf stat` output. Running time is recorded only as timestamps of the start, the exit and every event sample; software events are not recorded; and only a single hardware event was used (cycles; no instructions, branches or branch-misses). The count of the hardware event that was used can be approximated, but the result is not exact (the real cycle count was around 820-825 million):
$ perf report --header |grep Event
# Event count (approx.): 836622729
With non-default recording into `perf.data`, more events can be estimated:
$ echo '3^123456%3' | perf record -e cycles,instructions,branches,branch-misses bc
[ perf record: Captured and wrote 0.238 MB perf.data (5164 samples) ]
$ perf report --header |egrep Event\|Samples
# Samples: 1K of event 'cycles'
# Event count (approx.): 834809036
# Samples: 1K of event 'instructions'
# Event count (approx.): 1834083643
# Samples: 1K of event 'branches'
# Event count (approx.): 347750459
# Samples: 1K of event 'branch-misses'
# Event count (approx.): 15382047
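To turn this header into something closer to a `perf stat` overview, the estimates can be parsed and combined. A minimal sketch follows; it assumes the header lines look exactly like the output above, and `parse_event_counts` is a made-up helper name.

```python
import re

# Header lines copied from the perf report output above.
HEADER_TEXT = """\
# Samples: 1K of event 'cycles'
# Event count (approx.): 834809036
# Samples: 1K of event 'instructions'
# Event count (approx.): 1834083643
# Samples: 1K of event 'branches'
# Event count (approx.): 347750459
# Samples: 1K of event 'branch-misses'
# Event count (approx.): 15382047
"""


def parse_event_counts(text):
    """Map each event name to the approximate count that follows it."""
    counts = {}
    current_event = None
    for line in text.splitlines():
        name_match = re.search(r"of event '([^']+)'", line)
        if name_match:
            current_event = name_match.group(1)
            continue
        count_match = re.search(r"Event count \(approx\.\):\s*(\d+)", line)
        if count_match and current_event is not None:
            counts[current_event] = int(count_match.group(1))
    return counts


counts = parse_event_counts(HEADER_TEXT)
ipc = counts["instructions"] / counts["cycles"]
miss_pct = 100.0 * counts["branch-misses"] / counts["branches"]

for name, value in counts.items():
    print(f"{value:>15,}  {name}")
print(f"approx. {ipc:.2f} insn per cycle, {miss_pct:.2f}% branch misses")
```

In practice the text would come from `perf report --header` instead of a hard-coded string. The results are still estimates built from samples, so they will not match a real `perf stat` run exactly.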
So, you can't run `perf stat` on a `perf.data` file, but you can ask `perf report` to print the header with event count estimates. You can also try to parse timestamps from `perf script` / `perf script -D`.
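As a rough illustration of the timestamp idea, the sketch below pulls the time field out of `perf script` style lines and takes the span between the first and last sample. The sample lines are invented for illustration (they are not real output from the run above), and the assumed line layout, a `seconds.microseconds:` field after the pid, may differ between perf versions and options.

```python
import re

# Invented lines in the style of default `perf script` output.
# They are NOT taken from a real recording.
SAMPLE_LINES = """\
bc 12345 98765.432100:     250000 cycles:uppp:  55d0c0de1234 main (/usr/bin/bc)
bc 12345 98765.600000:     250000 cycles:uppp:  55d0c0de5678 main (/usr/bin/bc)
bc 12345 98765.758000:     250000 cycles:uppp:  55d0c0de9abc main (/usr/bin/bc)
"""

# Assumed timestamp layout: whitespace, seconds.fraction, colon, whitespace.
TIMESTAMP_RE = re.compile(r"\s(\d+\.\d+):\s")


def sample_timestamps(text):
    """Return the timestamps (in seconds) found in perf script style text."""
    stamps = []
    for line in text.splitlines():
        match = TIMESTAMP_RE.search(line)
        if match:
            stamps.append(float(match.group(1)))
    return stamps


stamps = sample_timestamps(SAMPLE_LINES)
elapsed = max(stamps) - min(stamps)
print(f"{len(stamps)} samples spanning about {elapsed:.4f} s")
```

This only measures the time between the first and last sample, so it underestimates the elapsed time that `perf stat` reports. The start and exit events from `perf script -D` would be needed for a closer figure.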
No, you can't. The output of `perf record` is a data file, while `perf stat` expects an application to run. You can use `perf script` to run pre-canned scripts that aggregate and summarize the trace data. The available scripts can be listed with the following command:
perf script -l
Besides the limited number of pre-canned scripts, you can also define custom perf.data processing scripts in Python or Perl.
See perf script, perf script in Python and perf script in Perl for details.
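A custom Python script for this purpose could look roughly like the sketch below. It assumes perf was built with Python scripting support and that, for sampled hardware events, perf calls `process_event` with a dictionary containing `ev_name` and a `sample` dictionary with a `period` field. Please check the perf-script-python documentation for your perf version, since the available keys can differ.

```python
from collections import defaultdict

# Running totals per event name, filled in as perf feeds samples to us.
event_totals = defaultdict(int)
sample_counts = defaultdict(int)


def process_event(param_dict):
    """Called once per sample (assumed perf script Python callback)."""
    name = param_dict["ev_name"]
    period = param_dict["sample"]["period"]
    # Each sample stands for roughly `period` occurrences of the event.
    event_totals[name] += period
    sample_counts[name] += 1


def trace_end():
    """Called once after the last sample: print a perf stat like summary."""
    for name in sorted(event_totals):
        print("%15d  %-20s (%d samples, approx.)"
              % (event_totals[name], name, sample_counts[name]))
```

Saved under a name such as `summary.py` (a hypothetical file name), it would be run against an existing recording with `perf script -s summary.py`. The totals it prints are estimates from the sample periods, the same kind of approximation that `perf report --header` shows.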