PMU x86-64 performance counters not showing in perf under AWS

Question

I am running a C++ benchmark for a specific application. In this test, I open the performance counter file (the __NR_perf_event_open syscall) before the critical section, run the section, and afterwards read the specified metric (instructions, cycles, branches, cache misses, etc.).
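For readers who want to reproduce the setup, the open / run / read sequence described above might look roughly like the sketch below. This is not the original benchmark code: `make_counter_attr` and `count_events` are hypothetical helper names, and the choice to exclude kernel and hypervisor activity is an assumption.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>

// Build the attribute block for one counting event (hypothetical helper).
inline perf_event_attr make_counter_attr(std::uint32_t type, std::uint64_t config) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = type;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.disabled = 1;        // start stopped; enabled just before the critical section
    attr.exclude_kernel = 1;  // assumption: count user-space activity only
    attr.exclude_hv = 1;      // assumption: ignore hypervisor activity
    return attr;
}

// Count events of the given kind while critical_section runs on the calling thread.
// Returns false if the counter could not be opened or read; count is then untouched.
template <typename Fn>
bool count_events(std::uint32_t type, std::uint64_t config, Fn&& critical_section,
                  std::uint64_t& count) {
    perf_event_attr attr = make_counter_attr(type, config);
    // pid = 0 and cpu = -1: measure this thread on whichever CPU it runs; no group, no flags.
    int fd = static_cast<int>(syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL));
    if (fd < 0) return false;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    critical_section();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    // With read_format left at 0, read() yields a single 64-bit counter value.
    std::uint64_t value = 0;
    bool ok = read(fd, &value, sizeof(value)) == static_cast<ssize_t>(sizeof(value));
    close(fd);
    if (ok) count = value;
    return ok;
}
```

A call could look like `count_events(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, [] { work(); }, n)`, where `work` stands in for the critical section and `n` is a `std::uint64_t`.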

I verified that this needs to run under sudo because the process needs the CAP_PERFMON capability. I also checked that /proc/sys/kernel/perf_event_paranoid is set to a value higher than 2, which always seems to be the case on Ubuntu 20.04.3 with kernel 5.11.0, the OS I standardized on across tests.
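The perf_event_paranoid check can also be done from inside the benchmark. The sketch below assumes the usual interpretation of the setting: levels up to 2 still let an unprivileged process count its own user-space events, while higher levels (a distribution-specific extension on Debian and Ubuntu, as far as I know) block unprivileged use entirely. Both function names are hypothetical.

```cpp
#include <fstream>

// Assumption: levels <= 2 permit unprivileged user-space counting;
// higher levels require root or the relevant capability.
inline bool unprivileged_user_counting_allowed(int paranoid_level) {
    return paranoid_level <= 2;
}

// Read the current level from procfs. Returns false if the file is missing or unparsable.
inline bool read_perf_event_paranoid(
    int& level, const char* path = "/proc/sys/kernel/perf_event_paranoid") {
    std::ifstream in(path);
    return static_cast<bool>(in >> level);
}
```

If the level read is above 2, the benchmark can print a hint to rerun under sudo instead of failing later inside perf_event_open.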

This setup works on all my local machines. On the cloud, however, it works only on some instances, such as m5zn.6xlarge (Intel Xeon Platinum 8252C). It does not work on others, such as t3.medium, c3.4xlarge, and c5a.8xlarge.

The AMI is the same on all of them: ami-09e67e426f25ce0d7.

One easy way to verify this behavior is to run the following command:

sudo perf stat /bin/sleep 1

On the m5zn box I see:

 Performance counter stats for '/bin/sleep 1':

          0.54 msec task-clock                #    0.001 CPUs utiliz
             1      context-switches          #    0.002 M/sec
             1      cpu-migrations            #    0.002 M/sec
            75      page-faults               #    0.139 M/sec
       2191485      cycles                    #    4.070 GHz
       1292564      instructions              #    0.59  insn per cyc
        258373      branches                  #  479.860 M/sec
         11090      branch-misses             #    4.29% of all branc

   1.000902741 seconds time elapsed

   0.000889000 seconds user
   0.000000000 seconds sys

Perf with valid output

On the other boxes I see:

 Performance counter stats for '/bin/sleep 1':

          0.62 msec task-clock                #    0.001 CPUs utilized
             2      context-switches          #    0.003 M/sec
             0      cpu-migrations            #    0.000 K/sec
            76      page-faults               #    0.124 M/sec
<not supported>      cycles
<not supported>      instructions
<not supported>      branches
<not supported>      branch-misses

   1.002488031 seconds time elapsed

   0.000930000 seconds user
   0.000000000 seconds sys

Perf with not supported values
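A benchmark that calls perf_event_open directly would presumably see the open itself fail on these boxes rather than read a zero count. The sketch below probes for the hardware cycles counter and sorts the resulting errno into coarse buckets. The mapping is an assumption based on the perf_event_open man page (ENOENT, EOPNOTSUPP and ENODEV are all documented for unsupported events or features), and `CounterStatus`, `classify_open_errno` and `probe_hw_cycles` are hypothetical names.

```cpp
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>

enum class CounterStatus { Available, NotSupported, PermissionDenied, OtherError };

// Map an errno from perf_event_open to a coarse status (assumed mapping, see lead-in).
inline CounterStatus classify_open_errno(int err) {
    switch (err) {
        case 0:
            return CounterStatus::Available;
        case ENOENT:      // unsupported generic event
        case EOPNOTSUPP:  // no hardware support for the requested event
        case ENODEV:      // feature not supported by this CPU
            return CounterStatus::NotSupported;
        case EACCES:
        case EPERM:
            return CounterStatus::PermissionDenied;
        default:
            return CounterStatus::OtherError;
    }
}

// Try to open (and immediately close) a user-space cycles counter on this thread.
inline CounterStatus probe_hw_cycles() {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;

    errno = 0;
    int fd = static_cast<int>(syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL));
    if (fd >= 0) {
        close(fd);
        return CounterStatus::Available;
    }
    return classify_open_errno(errno);
}
```

Running the probe once at startup lets the benchmark distinguish "run me under sudo" from "this instance exposes no hardware counters".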

My suspicion is that the m5zn.6xlarge is backed by a real (dedicated) machine while the others are shared instances. Is my suspicion correct?

Which instances can I launch that will give me PMU performance counter support?

Thank you!

Answer 1

After some research I found out that, because all Amazon AWS instances are virtual instances, no guest operating system can access the hardware performance counters (PMC or PMU) directly.

The guest OS can only read the performance counters through a virtual PMU (vPMU) that the hypervisor exposes to the guest, and this is available only for certain Intel Xeon CPUs.

Therefore, in the list of instances I tried, only the m5zn with an Intel Xeon Platinum 8252C has a supported CPU.

It is easy to check whether the guest OS supports vPMU by running:

cat /proc/cpuinfo | grep arch_perfmon

It is also possible to check the dmesg output right after smpboot. On an instance without vPMU, it reports software events only:

[    0.916264] smpboot: CPU0: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz (family: 0x6, model: 0x55, stepping: 0x4)
[    0.916410] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.
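The arch_perfmon check can also be made from inside the benchmark, so it can bail out early on instances without vPMU. The sketch below scans cpuinfo-formatted text for a whole-word flag match; `cpuinfo_has_flag` and `guest_has_arch_perfmon` are hypothetical names, and the parsing assumes the usual "flags : a b c" line layout of /proc/cpuinfo on x86.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// True if any "flags" line in cpuinfo-formatted text lists `flag` as a whole word.
inline bool cpuinfo_has_flag(std::istream& cpuinfo, const std::string& flag) {
    std::string line;
    while (std::getline(cpuinfo, line)) {
        if (line.compare(0, 5, "flags") != 0) continue;  // only "flags ..." lines
        std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;
        std::istringstream words(line.substr(colon + 1));
        std::string word;
        while (words >> word) {
            if (word == flag) return true;
        }
    }
    return false;
}

// Same test as `grep arch_perfmon /proc/cpuinfo`, but matching the exact flag name.
inline bool guest_has_arch_perfmon() {
    std::ifstream cpuinfo("/proc/cpuinfo");
    return cpuinfo && cpuinfo_has_flag(cpuinfo, "arch_perfmon");
}
```

Taking a stream rather than a path keeps the parser easy to exercise with canned text on a machine that does not have the flag.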

On AWS, the rule of thumb is that you get vPMU only on the largest instances, or on instances that take up an entire socket.

https://oavdeev.github.io/posts/vpmu_support_z1d/

Currently these instances support vPMU:

i3.metal
c5.9xlarge
c5.18xlarge
m4.16xlarge
m5.12xlarge
m5.24xlarge
r5.12xlarge
r5.24xlarge
f1.16xlarge
h1.16xlarge
i3.16xlarge
p2.16xlarge
p3.16xlarge
r4.16xlarge
x1.32xlarge
c5d.9xlarge
c5d.18xlarge
m5d.12xlarge
m5d.24xlarge
r5d.12xlarge
r5d.24xlarge
x1e.32xlarge
