Which perf events can use PEBS?
I want to understand which events can take the precise modifier on my CPU (Sandy Bridge).
The Intel Software Developer's Manual (Table 18-32, "PEBS Performance Events for Intel Microarchitecture Code Name Sandy Bridge") lists only the following events: INST_RETIRED, UOPS_RETIRED, BR_INST_RETIRED, BR_MISP_RETIRED, MEM_UOPS_RETIRED, MEM_LOAD_UOPS_RETIRED, MEM_LOAD_UOPS_LLC_HIT_RETIRED. SandyBridge_core_V15.json lists the same events with PEBS > 0.
However, there are some examples of using perf that add :p to the cycles event, and I can successfully run perf record -e cycles:p on my machine. Also, perf record -e cycles:p -vv -- sleep 1 prints precise_ip 1. Does that mean the CPU_CLK_UNHALTED event actually uses PEBS?
Is it possible to get the full list of events that support :p?
There is a hack to support cycles:p on Sandy Bridge, which has no PEBS for CPU_CLK_UNHALTED.*. The hack is implemented in the kernel part of perf, in intel_pebs_aliases_snb(). When the user requests -e cycles, which is PERF_COUNT_HW_CPU_CYCLES (it translates to CPU_CLK_UNHALTED.CORE), with a nonzero precise modifier, this function changes the hardware event to UOPS_RETIRED.ALL with PEBS:
    [PERF_COUNT_HW_CPU_CYCLES] = 0x003c,

    static void intel_pebs_aliases_snb(struct perf_event *event)
    {
        if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
            /*
             * Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
             * (0x003c) so that we can use it with PEBS.
             *
             * The regular CPU_CLK_UNHALTED.THREAD_P event (0x003c) isn't
             * PEBS capable. However we can use UOPS_RETIRED.ALL
             * (0x01c2), which is a PEBS capable event, to get the same
             * count.
             *
             * UOPS_RETIRED.ALL counts the number of cycles that retires
             * CNTMASK micro-ops. By setting CNTMASK to a value (16)
             * larger than the maximum number of micro-ops that can be
             * retired per cycle (4) and then inverting the condition, we
             * count all cycles that retire 16 or less micro-ops, which
             * is every cycle.
             *
             * Thereby we gain a PEBS capable cycle counter.
             */
            u64 alt_config = X86_CONFIG(.event=0xc2, .umask=0x01, .inv=1, .cmask=16);

            alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
            event->hw.config = alt_config;
        }
    }
The intel_pebs_aliases_snb hack is registered in __init int intel_pmu_init(void), under case INTEL_FAM6_SANDYBRIDGE: / case INTEL_FAM6_SANDYBRIDGE_X:, as:
    x86_pmu.event_constraints = intel_snb_event_constraints;
    x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
    x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
pebs_aliases is called from intel_pmu_hw_config() when precise_ip is non-zero:
    static int intel_pmu_hw_config(struct perf_event *event)
    {
        /* ... */
        if (event->attr.precise_ip) {
            /* ... */
            if (x86_pmu.pebs_aliases)
                x86_pmu.pebs_aliases(event);
        }
        /* ... */
    }
The hack was implemented in 2012; see the lkml threads "[PATCH] perf, x86: Make cycles:p working on SNB" and "[tip:perf/core] perf/x86: Implement cycles:p for SNB/IVB", and commit cccb9ba9e4ee0d750265f53de9258df69655c40b, http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=cccb9ba9e4ee0d750265f53de9258df69655c40b :
perf/x86: Implement cycles:p for SNB/IVB
Now that there's finally a chip with working PEBS (IvyBridge), we can enable the hardware and implement cycles:p for SNB/IVB.
I think there is no full list of such "precise" conversion hacks other than the Linux source code itself. In arch/x86/events/intel/core.c, grep for static void intel_pebs_aliases (usually it is cycles:p / CPU_CLK_UNHALTED 0x003c that gets aliased), then check intel_pmu_init for your actual CPU model and the exact x86_pmu.pebs_aliases variant selected:
intel_pebs_aliases_core2: INST_RETIRED.ANY_P (0x00c0) with CNTMASK=16 instead of cycles:p
intel_pebs_aliases_snb: UOPS_RETIRED.ALL (0x01c2) with CNTMASK=16 instead of cycles:p
intel_pebs_aliases_precdist: for the highest precise_ip, INST_RETIRED.PREC_DIST (0x01c0) instead of cycles:ppp, on SKL, IVB, HSW, BDW