
What do the perf record choices of LBR vs DWARF vs fp do?

When I use perf record on my code, I find three choices for the --call-graph option: lbr (last branch record), dwarf, and fp.

What is the difference between these?

The --call-graph option controls how perf collects call graphs (call chains), i.e. the function stack for each sample.

The default, fp, uses frame pointers. This is very efficient but can be unreliable, particularly for optimized code, where compilers often omit the frame pointer. By explicitly compiling with -fno-omit-frame-pointer, you can ensure that frame pointers are available for your code. Nevertheless, results for libraries may vary, since they may have been built without frame pointers.
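As a rough sketch of the frame-pointer workflow: build with frame pointers kept, then record with fp unwinding. The names my_app.c and ./my_app are hypothetical placeholders, and the -O2 -g flags are only an illustrative choice. The steps are guarded so they are skipped when the tools or files are missing.

```shell
# Hypothetical source and binary names; substitute your own.
SRC=my_app.c
BIN=./my_app

# Keep frame pointers even in optimized code so fp unwinding can walk the stack.
CFLAGS="-O2 -g -fno-omit-frame-pointer"

# Build only if gcc and the (hypothetical) source file are present.
if command -v gcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    gcc $CFLAGS -o "$BIN" "$SRC"
fi

# Record with frame-pointer call graphs, then print the report as text.
if command -v perf >/dev/null 2>&1 && [ -x "$BIN" ]; then
    perf record --call-graph fp -- "$BIN"
    perf report --stdio
fi
```

Frames inside libraries built without frame pointers may still be missing or wrong in the resulting report.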

With dwarf, perf collects and stores a part of the stack memory itself with each sample and unwinds it in post-processing. This can be very resource-consuming, and the stack depth that can be recovered may be limited. The default stack memory chunk is 8 KiB, but this can be configured.
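The chunk size can be passed after a comma in the --call-graph argument. A minimal sketch, assuming a hypothetical binary ./my_app and an illustrative size of 16384 bytes (twice the default):

```shell
BIN=./my_app              # hypothetical binary name
STACK_DUMP_BYTES=16384    # illustrative value; the default is 8192 bytes

# "dwarf,<size>" asks perf to copy <size> bytes of user stack per sample.
CALLGRAPH="dwarf,${STACK_DUMP_BYTES}"

# Guarded so nothing runs when perf or the binary is missing.
if command -v perf >/dev/null 2>&1 && [ -x "$BIN" ]; then
    perf record --call-graph "$CALLGRAPH" -- "$BIN"
    perf report --stdio
fi
```

A larger chunk lets the unwinder recover deeper stacks, but every sample grows by the same amount, so the perf.data file and the recording overhead grow with it.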

lbr stands for last branch records. This is a hardware mechanism supported by Intel CPUs. It will probably offer the best performance, at the cost of portability. lbr is also limited to user-space functions.
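Because lbr depends on the hardware, one possible approach is to probe for it and fall back to another mode. The sketch below is an assumption-laden illustration: pick_callgraph_mode is a hypothetical helper, ./my_app is a placeholder binary, and the probe assumes perf record exits with a non-zero status when the LBR call-graph event cannot be opened.

```shell
BIN=./my_app   # hypothetical binary name

# Hypothetical helper: print "lbr" if a trial recording succeeds, else "dwarf".
pick_callgraph_mode() {
    if command -v perf >/dev/null 2>&1; then
        probe=$(mktemp)
        # Trial run on a trivial command; assumes failure means no LBR support.
        if perf record --call-graph lbr -o "$probe" -- true >/dev/null 2>&1; then
            rm -f "$probe" "$probe.old"
            echo lbr
            return
        fi
        rm -f "$probe" "$probe.old"
    fi
    echo dwarf
}

MODE=$(pick_callgraph_mode)

# Guarded so nothing runs when perf or the binary is missing.
if command -v perf >/dev/null 2>&1 && [ -x "$BIN" ]; then
    perf record --call-graph "$MODE" -- "$BIN"
fi
```

Falling back to dwarf here is only one choice; fp would be the cheaper fallback if the code and its libraries were built with frame pointers.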


 