
Find the size of L1 and L2 cache with C++

I need to find the sizes of the L1 and L2 caches for an assignment, using a simple C++ program on Windows. I was able to find the size of the L3 cache on two different computers by timing accesses to the elements of arrays of increasing size. A large jump in access time marks the point where the array no longer fits in the cache and accesses go to RAM.

How do I figure out the L1 and L2 sizes from here?

The restriction is that I cannot read configuration data or use built-in functions to determine the values. I must time read/write operations instead.

I need to find the size of L1 and L2 cache for an assignment using a simple C/C++ program.

In general, you cannot (in theory).

A standard-conforming C11 implementation (read n1570) does not even need to run on a real computer with caches. The same holds for C++11 or C++14 (read n3337).

It could run:

  • with people (using a bunch of slaves to run a C program would be unethical, inefficient and slow, but it is possible; having the students in a classroom execute a program by hand for half an hour is an entertaining way of teaching C or C++, since the entire class becomes a C or C++ implementation).

  • on a computer without any cache. Today, many microcontrollers (like those on Arduino boards) can be programmed in C or C++ (and routinely are) and don't have any cache.

  • on your favorite x86-64 laptop. You had better read more about the way Intel and AMD processors deal with caches.

  • on something else (e.g. some Power9 motherboard, some Intel Edison, some Raspberry Pi, some emulator in your browser, etc.). You could be surprised!

How do I figure out the L1 and L2 sizes from here?

You could look at the generated assembler code (with GCC, compile using g++ -O1 -fverbose-asm -S, then look into the generated .s file), work out what kind of processor and ISA you have, and make an educated guess from the measured timings. Avoid asking for overly aggressive optimizations (since your program probably has undefined behavior).

On many OSes, you can use an operating-system-specific API to query the processor. On Linux, you could read /proc/cpuinfo, which is documented in proc(5).

If you run your program, benchmark it several times, and record the timings, you can make an educated guess about cache sizes (but you need to assume that your process has not been scheduled out too often; on a heavily loaded system that is not the case, and you should also avoid thrashing).
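One possible way to do the "benchmark it several times" part, as a sketch only: time the same piece of work repeatedly and keep the smallest value, on the assumption that the fastest run is the one least disturbed by the scheduler, then look for array sizes where the cost per access grows by some factor over the previous size. The helper names best_of and find_jumps and the idea of a fixed ratio threshold are my own choices, not something the question prescribes.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Run `work` `runs` times and return the smallest wall-clock time in
// nanoseconds (or -1 if runs <= 0). Taking the minimum is an assumption:
// it discards runs that were slowed down by other processes.
template <class F>
double best_of(int runs, F&& work) {
    double best = -1.0;
    for (int i = 0; i < runs; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        work();
        auto t1 = std::chrono::steady_clock::now();
        double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
        if (best < 0 || ns < best) best = ns;
    }
    return best;
}

// Given the cost per access for a sequence of increasing array sizes,
// return every index i whose cost is at least `ratio` times the cost at
// index i - 1. Each such index is a candidate cache-level boundary.
std::vector<std::size_t> find_jumps(const std::vector<double>& cost,
                                    double ratio) {
    std::vector<std::size_t> jumps;
    for (std::size_t i = 1; i < cost.size(); ++i) {
        if (cost[i - 1] > 0 && cost[i] >= ratio * cost[i - 1])
            jumps.push_back(i);
    }
    return jumps;
}
```

With per-access costs measured for sizes such as 4 KiB, 8 KiB, 16 KiB and so on, the indices returned by find_jumps point at the first size that no longer fits in some level. The threshold is a judgment call: too low and noise shows up as a boundary, too high and the usually smaller L1-to-L2 step is missed.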

BTW, I guess that under the as-if rule your program might be optimized into something you did not think about (see this). Notice that using cells inside a new-allocated array without initializing them is (in principle) undefined behavior (or at least unspecified behavior). So I believe a clever enough compiler could optimize your program to the equivalent of some abort() (or maybe just keep the printf).
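To avoid both problems, a measurement loop could write every cell before reading any, and make its result observable so the compiler may not delete the loop. Here is a minimal sketch under those assumptions, using a pointer chase: each slot stores the index of the next slot to visit, so every load depends on the previous one. The names make_cycle and chase and the fixed seed are illustrative, and I am assuming (not guaranteeing) that a random visiting order makes hardware prefetching less effective. The working set is n * sizeof(std::size_t) bytes.

```cpp
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Build next[] so that following next[i] starting from slot 0 visits every
// slot exactly once before coming back to 0. Sattolo's algorithm turns the
// identity permutation into a single cycle. Every element is initialized.
std::vector<std::size_t> make_cycle(std::size_t n, unsigned seed = 12345) {
    std::vector<std::size_t> next(n);
    std::iota(next.begin(), next.end(), std::size_t{0});
    std::mt19937 rng(seed);
    for (std::size_t i = n; i > 1; --i) {
        // Swap slot i - 1 with a strictly earlier slot.
        std::uniform_int_distribution<std::size_t> pick(0, i - 2);
        std::swap(next[i - 1], next[pick(rng)]);
    }
    return next;
}

// Follow the chain for `steps` dependent loads and return the final position.
// The volatile store makes the result an observable side effect, so under
// the as-if rule the loop cannot simply be removed.
// Precondition: `next` is non-empty.
std::size_t chase(const std::vector<std::size_t>& next, std::size_t steps) {
    std::size_t pos = 0;
    for (std::size_t s = 0; s < steps; ++s) pos = next[pos];
    volatile std::size_t sink = pos;
    return sink;
}
```

Timing chase(make_cycle(n), steps) for increasing n, with the same steps each time, gives a cost per access for each working-set size. Build the cycle outside the timed region so only the loads are measured.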


 