Perf measure cache misses on AMD CPU

I'm using an AMD Ryzen 5 1600 CPU and would like to use perf to measure the cache misses of a program. When I run perf stat -e cache-misses ./program, perf always reports 0 cache misses. Running perf list gives the following output:

  amd_iommu_0/cmd_processed/                         [Kernel PMU event]
  amd_iommu_0/cmd_processed_inv/                     [Kernel PMU event]
  amd_iommu_0/ign_rd_wr_mmio_1ff8h/                  [Kernel PMU event]
  amd_iommu_0/int_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/int_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_pass_excl/                         [Kernel PMU event]
  amd_iommu_0/mem_pass_pretrans/                     [Kernel PMU event]
  amd_iommu_0/mem_pass_untrans/                      [Kernel PMU event]
  amd_iommu_0/mem_target_abort/                      [Kernel PMU event]
  amd_iommu_0/mem_trans_total/                       [Kernel PMU event]
  amd_iommu_0/page_tbl_read_gst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_nst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_tot/                     [Kernel PMU event]
  amd_iommu_0/smi_blk/                               [Kernel PMU event]
  amd_iommu_0/smi_recv/                              [Kernel PMU event]
  amd_iommu_0/tlb_inv/                               [Kernel PMU event]
  amd_iommu_0/vapic_int_guest/                       [Kernel PMU event]
  amd_iommu_0/vapic_int_non_guest/                   [Kernel PMU event]
  branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
  branch-misses OR cpu/branch-misses/                [Kernel PMU event]
  cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
  cache-references OR cpu/cache-references/          [Kernel PMU event]
  cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
  instructions OR cpu/instructions/                  [Kernel PMU event]
  msr/aperf/                                         [Kernel PMU event]
  msr/irperf/                                        [Kernel PMU event]
  msr/mperf/                                         [Kernel PMU event]
  msr/tsc/                                           [Kernel PMU event]
  stalled-cycles-backend OR cpu/stalled-cycles-backend/ [Kernel PMU event]
  stalled-cycles-frontend OR cpu/stalled-cycles-frontend/ [Kernel PMU event]

  rNNN                                               [Raw hardware event descriptor]
  cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
   (see 'man perf-list' on how to encode it)

  mem:<addr>[/len][:access]                          [Hardware breakpoint]

Running sudo perf list gives many more events than the ones above. I'm not sure why cache-misses isn't working, because events like branch-misses work fine. Maybe I have to use one of the amd_iommu_0 events (amd_iommu_0/mem_dte_mis/ looks promising, but I'm not sure what it actually measures)? Is there any reference that explains what these events are?

The AMD Ryzen 5 1600 is based on AMD's Zen microarchitecture. A quick lookup of Zen shows that the CPUID family associated with this microarchitecture is 17h.
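To confirm the family on a particular machine, note that Linux reports it in decimal in /proc/cpuinfo, so family 17h shows up as 23. The following is a minimal sketch of that conversion; the function name cpu_family_hex is made up for illustration, and it assumes the usual "cpu family : N" line format of x86 /proc/cpuinfo.

```python
import re


def cpu_family_hex(cpuinfo_text):
    """Return the first 'cpu family' value found in cpuinfo_text as a
    hex string such as '17h', or None if no such line exists.

    Hypothetical helper: assumes the x86 /proc/cpuinfo layout, where the
    family is printed in decimal (23 corresponds to 17h).
    """
    match = re.search(r"^cpu family\s*:\s*(\d+)", cpuinfo_text, re.MULTILINE)
    if match is None:
        return None
    return f"{int(match.group(1)):X}h"
```

Passing it the contents of /proc/cpuinfo on a Zen machine should give 17h if the assumptions above hold.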

Note that the event cache-misses is mapped to the generalized hardware event PERF_COUNT_HW_CACHE_MISSES, which is not available on all platforms.

In the latest Linux kernel source at the time of writing (5.3.11), the event cache-misses is not directly supported for CPU family 17h and above.

To understand most of the performance monitoring counter (PMC) events for AMD, you need to consult the following reference:

AMD BIOS and Kernel Developer Guide (I could not find it for CPU family 17h)

The other option is to pass raw hexadecimal event codes instead of event names, in the format -e rXXXX, where XXXX is the code. Another answer describes how to obtain this raw hexadecimal code for events like cache-misses.
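As a rough illustration of how such a raw code is put together, the sketch below packs an event select and a unit mask into an rXXXX descriptor. It assumes the AMD core PMU layout that perf exposes under /sys/bus/event_source/devices/cpu/format/ (event select in config bits 0-7 and 32-35, unit mask in bits 8-15), so check those files on your own system first. The function name amd_raw_event is hypothetical, and the values in the usage note are placeholders, not a confirmed cache-miss event for this CPU.

```python
def amd_raw_event(event_select, unit_mask):
    """Build a perf raw event descriptor ('r' + hex config) from an AMD
    event select (up to 12 bits) and a unit mask (8 bits).

    Assumed layout (verify in /sys/bus/event_source/devices/cpu/format/):
      event select bits 7:0  -> config bits 7:0
      unit mask              -> config bits 15:8
      event select bits 11:8 -> config bits 35:32
    """
    if not 0 <= event_select <= 0xFFF:
        raise ValueError("event_select must fit in 12 bits")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit_mask must fit in 8 bits")
    config = (event_select & 0xFF) | (unit_mask << 8) | ((event_select >> 8) << 32)
    return f"r{config:x}"
```

For example, with a placeholder event select of 0x64 and a placeholder unit mask of 0x09, the helper produces r964, which would then be passed as perf stat -e r964 ./program. The actual event select and unit mask for the counter you want have to come from AMD's documentation for your CPU family.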

You can also look at this commit to get further details as to how cache misses are being represented.
