
What is considered a good cache hit/miss ratio?

I am running ocount on our program to count L2 cache read events, and we got these results:

Event                               Count                    % time counted
l2_rqsts:all_demand_data_rd         14,418,959,276           80.01
l2_rqsts:demand_data_rd_hit         6,297,000,387            80.00
l2_rqsts:demand_data_rd_miss        6,104,577,343            80.00
l2_rqsts:l2_pf_hit                  667,709,870              80.01
l2_rqsts:l2_pf_miss                 1,641,991,158            79.99

However, we have no idea whether these results should be considered total cache thrashing or not.

What do you consider a good hit/miss ratio for an L2 cache?

I expect it depends heavily on the CPU architecture and the application requirements, but is there a generally accepted value for it?

It depends on the application. At the extremes:

  • If every memory access is to the same location, or is strided and fits within the cache level of interest (say 256KB total size for a typical L2 cache) without any evictions due to associativity conflicts, the app can approach a 100% hit rate.
  • If memory accesses happen in a region much larger than the cache and are truly random, you could probably end up well under a 50% hit rate (I'm not sure of an analytic way to arrive at an exact number, but I would guess it depends on the probability distribution of hitting a given line).
  • You could intentionally construct a pathological case where your app alternates memory accesses between two different memory locations that happen to collide on the same cache line, given whatever way your processor handles associativity. In this case the hit rate would approach 0%.
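The three extremes above can be reproduced with a toy model. What follows is a minimal sketch, not a model of any real CPU: it assumes a 256KB, 8-way set-associative cache with 64-byte lines and strict LRU replacement, and all names (`hit_rate`, `small_working_set`, and so on) are made up for illustration. Because the modelled cache is 8-way, the pathological case cycles over `WAYS + 1` lines that map to the same set; on a direct-mapped cache two colliding locations would be enough.

```python
import random
from collections import OrderedDict

LINE = 64            # bytes per cache line (assumed)
CACHE = 256 * 1024   # total cache size in bytes, matching the 256KB example
WAYS = 8             # associativity (assumed)
SETS = CACHE // (LINE * WAYS)

def hit_rate(addresses):
    """Replay byte addresses through a set-associative LRU cache; return hits/accesses."""
    sets = [OrderedDict() for _ in range(SETS)]
    hits = total = 0
    for addr in addresses:
        line = addr // LINE
        lru = sets[line % SETS]
        total += 1
        if line in lru:
            hits += 1
            lru.move_to_end(line)        # mark as most recently used
        else:
            if len(lru) == WAYS:
                lru.popitem(last=False)  # evict the least recently used line
            lru[line] = None
    return hits / total if total else 0.0

def small_working_set(passes=50, size=128 * 1024):
    """Strided sweeps over a region that fits in the cache."""
    for _ in range(passes):
        yield from range(0, size, LINE)

def random_large_region(n=200_000, size=16 * 1024 * 1024, seed=0):
    """Uniformly random accesses over a region 64x larger than the cache."""
    rng = random.Random(seed)
    for _ in range(n):
        yield rng.randrange(size)

def conflicting_addresses(n=100_000):
    """Round-robin over WAYS + 1 lines that all map to set 0."""
    stride = SETS * LINE
    for i in range(n):
        yield (i % (WAYS + 1)) * stride

if __name__ == "__main__":
    print("small working set:", hit_rate(small_working_set()))
    print("random, large region:", hit_rate(random_large_region()))
    print("set conflicts:", hit_rate(conflicting_addresses()))
```

In this model the first workload misses only on its first pass, the conflict workload never hits, and the uniformly random workload settles near the ratio of cache size to region size. Real hardware adds prefetchers and different replacement policies, so treat the numbers as qualitative.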

I doubt there's any work on an analytic model to predict what kinds of values you might see for a more realistic workload, but there have definitely been some profiles run on common benchmarks. For example: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.152.3943&rep=rep1&type=pdf . These folks show a rate of between 20 and 50 misses per thousand instructions (MPKI) on the mcf workload from SPECcpu2000. Here's a description of that workload: https://www.spec.org/cpu2000/CINT2000/181.mcf/docs/181.mcf.html . It may or may not look to the memory subsystem like what you're interested in optimizing.
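For comparison with published MPKI figures, the metric is simply misses divided by retired instructions, scaled by 1000. A small sketch follows. The miss count is `l2_rqsts:demand_data_rd_miss` from the question, but the instruction count is a hypothetical placeholder, since the question does not report one; it would have to be measured with a separate instruction-count event.

```python
def mpki(misses, instructions):
    """Misses per thousand instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions

# demand_data_rd_miss as reported in the question
l2_demand_misses = 6_104_577_343

# HYPOTHETICAL value for illustration only; not taken from the question
instructions_retired = 300_000_000_000

if __name__ == "__main__":
    print(f"L2 demand-read MPKI: {mpki(l2_demand_misses, instructions_retired):.1f}")
```

MPKI is often easier to compare across programs than a raw hit ratio, because it also reflects how often the program touches memory at all.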

Back to why you might be asking the question in the first place: if other profiling data shows that you're more bound on cache or memory accesses than on arithmetic, locking, etc., then you might pick some heuristic threshold, say an 80% or 95% hit rate, and if you're under it, it might be worth trying to optimize cache access.
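Applied to the counters in the question, that heuristic might look like the sketch below. The 80% threshold is just the example value from the paragraph above, and `worth_optimizing` and its `memory_bound` flag are illustrative names, where the flag stands for whatever your other profiling data says. Note that the hit and miss counts sum to about 12.4 billion rather than the 14.4 billion reported for `all_demand_data_rd`, possibly because the events were only counted about 80% of the time and scaled up, so the ratios are estimates.

```python
def hit_ratio(hits, misses):
    """Fraction of requests that hit, given separate hit and miss counts."""
    total = hits + misses
    return hits / total if total else 0.0

def worth_optimizing(rate, memory_bound, threshold=0.80):
    """Heuristic: look at cache behaviour only if memory bound AND below the threshold."""
    return memory_bound and rate < threshold

# Counts copied from the ocount output in the question
demand_rate = hit_ratio(6_297_000_387, 6_104_577_343)   # demand data reads
prefetch_rate = hit_ratio(667_709_870, 1_641_991_158)   # L2 prefetcher requests

if __name__ == "__main__":
    print(f"demand read hit ratio: {demand_rate:.1%}")
    print(f"prefetch hit ratio:    {prefetch_rate:.1%}")
    # memory_bound=True is an assumption here; confirm it with other profiling first
    print("worth optimizing:", worth_optimizing(demand_rate, memory_bound=True))
```

With these counts the demand-read hit ratio comes out at roughly half, well under either example threshold, but that only matters if the program is actually limited by memory accesses.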


 