I'm using an AMD Ryzen 5 1600 CPU and I would like to use perf to measure the cache misses of a program. When I run perf stat -e cache-misses ./program, perf always reports 0 cache misses. Running perf list gives the following output:
amd_iommu_0/cmd_processed/ [Kernel PMU event]
amd_iommu_0/cmd_processed_inv/ [Kernel PMU event]
amd_iommu_0/ign_rd_wr_mmio_1ff8h/ [Kernel PMU event]
amd_iommu_0/int_dte_hit/ [Kernel PMU event]
amd_iommu_0/int_dte_mis/ [Kernel PMU event]
amd_iommu_0/mem_dte_hit/ [Kernel PMU event]
amd_iommu_0/mem_dte_mis/ [Kernel PMU event]
amd_iommu_0/mem_iommu_tlb_pde_hit/ [Kernel PMU event]
amd_iommu_0/mem_iommu_tlb_pde_mis/ [Kernel PMU event]
amd_iommu_0/mem_iommu_tlb_pte_hit/ [Kernel PMU event]
amd_iommu_0/mem_iommu_tlb_pte_mis/ [Kernel PMU event]
amd_iommu_0/mem_pass_excl/ [Kernel PMU event]
amd_iommu_0/mem_pass_pretrans/ [Kernel PMU event]
amd_iommu_0/mem_pass_untrans/ [Kernel PMU event]
amd_iommu_0/mem_target_abort/ [Kernel PMU event]
amd_iommu_0/mem_trans_total/ [Kernel PMU event]
amd_iommu_0/page_tbl_read_gst/ [Kernel PMU event]
amd_iommu_0/page_tbl_read_nst/ [Kernel PMU event]
amd_iommu_0/page_tbl_read_tot/ [Kernel PMU event]
amd_iommu_0/smi_blk/ [Kernel PMU event]
amd_iommu_0/smi_recv/ [Kernel PMU event]
amd_iommu_0/tlb_inv/ [Kernel PMU event]
amd_iommu_0/vapic_int_guest/ [Kernel PMU event]
amd_iommu_0/vapic_int_non_guest/ [Kernel PMU event]
branch-instructions OR cpu/branch-instructions/ [Kernel PMU event]
branch-misses OR cpu/branch-misses/ [Kernel PMU event]
cache-misses OR cpu/cache-misses/ [Kernel PMU event]
cache-references OR cpu/cache-references/ [Kernel PMU event]
cpu-cycles OR cpu/cpu-cycles/ [Kernel PMU event]
instructions OR cpu/instructions/ [Kernel PMU event]
msr/aperf/ [Kernel PMU event]
msr/irperf/ [Kernel PMU event]
msr/mperf/ [Kernel PMU event]
msr/tsc/ [Kernel PMU event]
stalled-cycles-backend OR cpu/stalled-cycles-backend/ [Kernel PMU event]
stalled-cycles-frontend OR cpu/stalled-cycles-frontend/ [Kernel PMU event]
rNNN [Raw hardware event descriptor]
cpu/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor]
(see 'man perf-list' on how to encode it)
mem:<addr>[/len][:access] [Hardware breakpoint]
Running sudo perf list gives a lot more events than the ones above. I'm not sure why cache-misses isn't working, because events like branch-misses work fine. Maybe I have to use one of the amd_iommu_0 events (amd_iommu_0/mem_dte_mis/ looks promising, but I'm not sure what it measures)? Is there any reference that explains what these events are?
The AMD Ryzen 5 1600 is based on AMD's Zen microarchitecture. A quick lookup of Zen shows that the CPU family associated with this microarchitecture (as reported by CPUID) is 17h.
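To check which family your own CPU reports, one option is to read the "cpu family" field from /proc/cpuinfo, which Linux prints in decimal (17h is 23). The following is a minimal sketch, assuming a Linux system; the helper name cpu_family_hex is made up for illustration.

```python
import re


def cpu_family_hex(cpuinfo_text):
    """Return the first 'cpu family' value in cpuinfo_text as a string like '17h'.

    Returns None when the field is absent (for example on non-x86 systems).
    """
    match = re.search(r"^cpu family\s*:\s*(\d+)\s*$", cpuinfo_text, re.MULTILINE)
    if match is None:
        return None
    # /proc/cpuinfo reports the family in decimal; AMD documentation uses hex.
    return f"{int(match.group(1)):X}h"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(cpu_family_hex(f.read()))
    except OSError:
        print("could not read /proc/cpuinfo")
```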
Note that the event cache-misses is mapped to the generalized hardware event PERF_COUNT_HW_CACHE_MISSES, which is not readily available on all platforms. In the latest Linux kernel source at the time of this writing (5.3.11), the event cache-misses is not directly supported for CPU family 17h and above.
To understand most of the performance monitoring counter (PMC) events for AMD, you need to consult the following reference: the AMD BIOS and Kernel Developer's Guide (I could not find one for CPU family 17h).
The other option is to pass raw hexadecimal event codes instead of event names, in the format -e rXXXX, where XXXX is the code. Another answer over here describes how you can obtain this raw hexadecimal code for events like cache-misses.
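For events whose event select fits in 8 bits, the value after the r is commonly laid out with the unit mask in bits 8-15 and the event select in bits 0-7. The sketch below composes such a descriptor under that assumption. The function name raw_event and the example values 0x12 and 0x34 are placeholders, not real Zen event codes; the actual values have to come from AMD's documentation for your CPU family.

```python
def raw_event(event_select, unit_mask):
    """Build a perf raw event descriptor of the form rXXXX.

    Assumes an 8-bit event select (bits 0-7) and an 8-bit unit mask
    (bits 8-15). Wider event selects need extra bits and are rejected here.
    """
    if not 0 <= event_select <= 0xFF:
        raise ValueError("event_select must fit in 8 bits for this sketch")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit_mask must fit in 8 bits")
    return f"r{(unit_mask << 8) | event_select:04x}"


if __name__ == "__main__":
    # Placeholder values for illustration only, not a real cache-miss event.
    print(raw_event(0x12, 0x34))  # prints r3412
```

The resulting string would then be passed to perf as, for example, perf stat -e r3412 ./program, with the placeholder replaced by a code taken from the documentation.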
You can also look at this commit to get further details as to how cache misses are being represented.