
What do perf cache events mean?

I am trying to figure out why a modified C program runs faster than its unmodified counterpart (I added a few lines of code that perform some additional work). I suspect cache effects, specifically in the instruction cache, to be the main explanation. So I turned to the perf profiling tool (https://perf.wiki.kernel.org/index.php/Main_Page), but unfortunately I cannot make sense of its output regarding cache misses.

perf provides several cache-related events:

  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-dcache-store-misses                             [Hardware cache event]
  L1-dcache-prefetches                               [Hardware cache event]
  L1-dcache-prefetch-misses                          [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-prefetches                               [Hardware cache event]
  L1-icache-prefetch-misses                          [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-prefetches                                     [Hardware cache event]
  LLC-prefetch-misses                                [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-prefetches                                    [Hardware cache event]
  dTLB-prefetch-misses                               [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  node-loads                                         [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-stores                                        [Hardware cache event]
  node-store-misses                                  [Hardware cache event]
  node-prefetches                                    [Hardware cache event]
  node-prefetch-misses                               [Hardware cache event]

Where can I find an explanation of these events? The cache-misses count is always smaller than the other counts. What does this event measure?

In the following example, how should I interpret the 26,760 L1-icache-load-misses for ls versus the 5,708 cache-misses?

perf stat -e L1-icache-load-misses ls
caches  caches~  out

 Performance counter stats for 'ls':

            26,760 L1-icache-load-misses                                       

       0.002816690 seconds time elapsed



perf stat -e cache-misses ls
caches  caches~  out

 Performance counter stats for 'ls':

             5,708 cache-misses                                                

       0.002822122 seconds time elapsed

Some answers:

  • L1 is the level-1 cache, the smallest and fastest one. LLC, on the other hand, refers to the last level of the cache hierarchy, i.e. the largest but slowest cache.
  • i vs. d distinguishes the instruction cache from the data cache. Typically only L1 is split in this way; the other cache levels are shared between data and instructions.
  • TLB refers to the translation lookaside buffer, a cache used when mapping virtual addresses to physical ones.
  • There are different TLB counters depending on whether the translated address refers to an instruction (iTLB) or to data (dTLB).
  • For all data accesses, separate counters are kept depending on whether the given memory location was read (loads), written (stores), or prefetched (i.e. retrieved for reading at some later time).
  • The number of misses indicates how often a given item of data was accessed but was not present in the cache.
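The naming scheme described in the list above can be sketched as a small decoder. This is only an illustration: describe_event and its tables are hypothetical helpers, not part of perf, and the descriptions given for the branch and node prefixes are my reading of the perf_event_open documentation rather than something stated in this answer.

```python
# Hypothetical helper (not part of perf): split a perf cache event name
# such as "L1-icache-load-misses" into cache, operation and result.
CACHES = {
    "L1-dcache": "level-1 data cache",
    "L1-icache": "level-1 instruction cache",
    "LLC": "last-level cache",
    "dTLB": "data TLB",
    "iTLB": "instruction TLB",
    "branch": "branch prediction unit",   # assumption, see lead-in
    "node": "local memory node",          # assumption, see lead-in
}
OPERATIONS = {
    "load": "read", "loads": "read",
    "store": "write", "stores": "write",
    "prefetch": "prefetch", "prefetches": "prefetch",
}


def describe_event(name):
    """Return (cache, operation, result) for a '[Hardware cache event]' name."""
    for prefix, cache in CACHES.items():
        if name.startswith(prefix + "-"):
            rest = name[len(prefix) + 1:]
            break
    else:
        raise ValueError(f"{name!r} does not follow the cache event naming scheme")
    # A trailing "-misses" selects the miss counter; otherwise all accesses count.
    is_miss = rest.endswith("-misses")
    op_word = rest[:-len("-misses")] if is_miss else rest
    if op_word not in OPERATIONS:
        raise ValueError(f"unknown operation in {name!r}")
    return cache, OPERATIONS[op_word], "miss" if is_miss else "access"


if __name__ == "__main__":
    for event in ("L1-icache-load-misses", "LLC-stores", "dTLB-prefetch-misses"):
        print(event, "->", describe_event(event))
```

Note that cache-misses and cache-references are rejected by this decoder: perf lists them as [Hardware event], not [Hardware cache event], and they do not follow the cache-operation-result pattern.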

You seem to think that the cache-misses event is the sum of all the other kinds of cache misses (L1-dcache-load-misses, and so on). That is not the case.

The cache-misses event represents the number of memory accesses that could not be served by any of the caches.

I admit that perf's documentation is not the best around.

However, you can learn quite a lot by reading the documentation of the perf_event_open() function. It assumes that you already have a good knowledge of how a CPU and a performance monitoring unit work; it is clearly not a computer architecture course:

http://web.eece.maine.edu/~vweaver/projects/perf_events/perf_event_open.html

For example, by reading it you can see that the cache-misses event shown by perf list corresponds to PERF_COUNT_HW_CACHE_MISSES. That page describes it as usually indicating last-level cache misses, intended to be used together with PERF_COUNT_HW_CACHE_REFERENCES to compute a cache miss rate.
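To make that correspondence concrete, here is a minimal sketch of how event names from perf list could translate into the type and config fields of struct perf_event_attr. The numeric constants are the values I recall from linux/perf_event.h and the perf_event_open documentation; treat them as assumptions and check them against your own headers. hw_cache_config and the EVENTS table are hypothetical names used only for this illustration.

```python
# Values as recalled from linux/perf_event.h (assumption: verify locally).
PERF_TYPE_HARDWARE = 0
PERF_TYPE_HW_CACHE = 3

PERF_COUNT_HW_CACHE_REFERENCES = 2
PERF_COUNT_HW_CACHE_MISSES = 3

CACHE_ID = {"L1D": 0, "L1I": 1, "LL": 2, "DTLB": 3, "ITLB": 4, "BPU": 5, "NODE": 6}
OP_ID = {"READ": 0, "WRITE": 1, "PREFETCH": 2}
RESULT_ID = {"ACCESS": 0, "MISS": 1}


def hw_cache_config(cache, op, result):
    """Build the config value for a PERF_TYPE_HW_CACHE event:
    cache id | (operation id << 8) | (result id << 16)."""
    return CACHE_ID[cache] | (OP_ID[op] << 8) | (RESULT_ID[result] << 16)


# Hypothetical lookup table for a few of the names printed by `perf list`.
EVENTS = {
    "cache-references": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES),
    "cache-misses": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES),
    "L1-icache-load-misses": (PERF_TYPE_HW_CACHE, hw_cache_config("L1I", "READ", "MISS")),
    "LLC-load-misses": (PERF_TYPE_HW_CACHE, hw_cache_config("LL", "READ", "MISS")),
}

if __name__ == "__main__":
    for name, (event_type, config) in EVENTS.items():
        print(f"{name:24s} type={event_type} config={config:#x}")
```

Under these assumptions, cache-misses and L1-icache-load-misses end up with different event types, which is one way to see that the former is a separate generic counter and not a sum over the per-cache miss events.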

According to the perf tutorial, Performance Monitoring Unit (PMU) events, or hardware events, are events that can be mapped directly to CPU-specific events defined by the CPU vendor. Hardware cache events, by contrast, are monikers provided by perf that may be mapped to actual events provided by the CPU. To list perf's cache events, run perf list cache in a Linux terminal.


 