Stack, cache misses and virtual memory

In general, as far as I remember, the stack of a program is a special portion of memory handled in a special way (as a LIFO structure, i.e. a 'stack').

I am working on Linux in C and C++ and I am not sure about the following points:

  1. Since the stack is just a piece of general memory, does that mean that in a Linux process it lives in some pages of that process's virtual memory?

  2. I know that data residing in the L1 cache (I had always thought about the heap only) is quicker to retrieve than data in the L3 cache. Does this apply to the stack as well?

The stack is usually faster than the heap, but if point 2 is true, some data from the stack might still sit in an L3 line and slow the program down.

Am I right to reason in these terms, or am I missing something?

It is processor specific: AMD and Intel do different things, and even within each brand it is model specific.

Some processors (I forget which, perhaps older AMD ones) tie the stack-related machine instructions (i.e. PUSH, POP, RET, CALL, etc.) to the L1 cache.

BTW, Andrew Appel wrote (in the previous century) that garbage collection can be faster than stack allocation (for SML compiled using CPS techniques), but, IIRC, this is less true today because current processors have some behavior relating the call stack to the cache.

But I believe that hot parts of the call stack are often in the L1 cache (even without hardware support for that), because the data there (local variables, return addresses, ...) is accessed often.

Of course, the call stack is in virtual memory; use proc(5), e.g. try

 tail /proc/$$/maps

(you could use cat instead of tail) to obtain perhaps:

7f6366db5000-7f6366dd5000 r-xp 00000000 08:11 2100860                    /lib/x86_64-linux-gnu/ld-2.19.so
7f6366fac000-7f6366fb0000 rw-p 00000000 00:00 0 
7f6366fcc000-7f6366fd3000 r--s 00000000 08:11 964796                     /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
7f6366fd3000-7f6366fd5000 rw-p 00000000 00:00 0 
7f6366fd5000-7f6366fd6000 r--p 00020000 08:11 2100860                    /lib/x86_64-linux-gnu/ld-2.19.so
7f6366fd6000-7f6366fd7000 rw-p 00021000 08:11 2100860                    /lib/x86_64-linux-gnu/ld-2.19.so
7f6366fd7000-7f6366fd8000 rw-p 00000000 00:00 0 
7fff59aa1000-7fff59ac2000 rw-p 00000000 00:00 0                          [stack]
7fff59bfe000-7fff59c00000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Notice the [stack] segment.
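As a rough cross-check of the listing above, here is a minimal C sketch that parses /proc/self/maps and tests whether a given address falls inside the mapping labeled [stack]. The helper name addr_in_stack_mapping is hypothetical (it is not part of the original answer), and the sketch assumes Linux with procfs mounted and a caller running on the main thread, since stacks of other threads are not expected to carry the plain [stack] label.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if addr lies inside the mapping that /proc/self/maps labels
 * [stack], 0 if it lies outside it, and -1 if the file cannot be read or
 * no parsable [stack] line is found. */
int addr_in_stack_mapping(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[512];
    int result = -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        /* Skip every mapping except the one tagged [stack]. */
        if (strstr(line, "[stack]") == NULL)
            continue;

        /* Each line starts with "start-end" in hexadecimal. */
        uint64_t start, end;
        if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) == 2) {
            uint64_t a = (uint64_t)(uintptr_t)addr;
            result = (a >= start && a < end);
        }
        break;
    }
    fclose(maps);
    return result;
}
```

Called from main with the address of a local variable, this should report 1 on a typical Linux system, while the address of a static variable or of a malloc'ed block should report 0. That is consistent with the call stack being ordinary pages of the process's virtual memory.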

Read also about ASLR & vdso(7).

By definition (of CPU caches), the L1 cache usually contains the most often accessed data. Cache misses are costly (an access to data in your RAM sticks can be 100x slower than an access to the L1 cache).
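To get a feel for that cost, one possible experiment is to sum the same buffer twice, once sequentially and once with a large stride so that consecutive accesses land on different cache lines. The sketch below is only an illustration under assumptions: the function names sum_with_stride and time_sum are hypothetical, clock() measures CPU time rather than wall time, and the measured ratio depends heavily on the processor model, the buffer size and the compiler flags (a sequential loop may also be vectorized, which is a separate effect).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: visits every element of buf exactly once.
 * With stride == 1 the walk is sequential; with a larger stride it
 * jumps `stride` bytes between accesses and then restarts from the
 * next starting offset, which tends to be less cache friendly. */
unsigned long sum_with_stride(const unsigned char *buf, size_t n, size_t stride)
{
    assert(stride > 0);
    unsigned long sum = 0;
    for (size_t start = 0; start < stride; start++)
        for (size_t i = start; i < n; i += stride)
            sum += buf[i];
    return sum;
}

/* Hypothetical helper: runs one walk, stores its sum in *out and
 * returns the CPU time it took, in seconds. */
double time_sum(const unsigned char *buf, size_t n, size_t stride,
                unsigned long *out)
{
    clock_t t0 = clock();
    *out = sum_with_stride(buf, n, stride);
    clock_t t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

With a buffer much larger than the last-level cache (say a few hundred megabytes from malloc), comparing time_sum(buf, n, 1, &s) with time_sum(buf, n, 4096, &s) should show the strided walk taking noticeably longer even though both compute the same sum. The same reasoning applies to stack data: what matters is whether the cache line is hot, not which region of virtual memory it belongs to.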
