Why does a dynamically linked executable on Linux have the complete memory space of libc in its own memory space?

I compiled the following C code with gcc -Wall -m32 test.c -o test on a 64-bit Ubuntu system:

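#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* declares sleep() */

int main(void) {
    /* Allocate 1 MiB so the allocation is visible in /proc/PID/maps. */
    char *buffer = malloc(1048576);
    if (buffer == NULL)
        return 1;
    printf("hi\n");
    sleep(20);        /* keep the process alive long enough to inspect it */
    free(buffer);
    return 0;
}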

Now when I run the code and do a cat /proc/PID/maps to look at the virtual memory ranges the process is using, I see the following:

08048000-08049000 r-xp 00000000 08:06 3805439  /home/me/test
08049000-0804a000 r--p 00000000 08:06 3805439  /home/me/test
0804a000-0804b000 rw-p 00001000 08:06 3805439  /home/me/test
f7475000-f7577000 rw-p 00000000 00:00 0
f7577000-f7720000 r-xp 00000000 08:06 8002662  /lib/i386-linux-gnu/libc-2.19.so
f7720000-f7721000 ---p 001a9000 08:06 8002662  /lib/i386-linux-gnu/libc-2.19.so
f7721000-f7723000 r--p 001a9000 08:06 8002662  /lib/i386-linux-gnu/libc-2.19.so
f7723000-f7724000 rw-p 001ab000 08:06 8002662  /lib/i386-linux-gnu/libc-2.19.so
f7724000-f7727000 rw-p 00000000 00:00 0
f7746000-f7748000 rw-p 00000000 00:00 0
f7748000-f7749000 r-xp 00000000 00:00 0        [vdso]
f7749000-f7769000 r-xp 00000000 08:06 8002671  /lib/i386-linux-gnu/ld-2.19.so
f7769000-f776a000 r--p 0001f000 08:06 8002671  /lib/i386-linux-gnu/ld-2.19.so
f776a000-f776b000 rw-p 00020000 08:06 8002671  /lib/i386-linux-gnu/ld-2.19.so
ffa60000-ffa81000 rw-p 00000000 00:00 0        [stack]

So the code area is between 08048000 and 0804b000, then there's the 1048576 bytes on the heap for the buffer in f7475000-f7577000. But then between f7577000 and f7724000 there are roughly 1758972 bytes for the dynamically linked libc (that's pretty much the size of the library on the HDD). Why is that? The same thing happens with ld a bit further down.
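As a quick check of the arithmetic, the size of each range can be computed directly from the addresses in the listing. This is a minimal sketch; range_size and print_range_sizes are illustrative names, not part of any library.

```c
#include <stdio.h>

/* Size in bytes of the half-open address range [start, end). */
unsigned long range_size(unsigned long start, unsigned long end)
{
    return end - start;
}

/* Prints the sizes of three ranges copied from the maps listing above. */
void print_range_sizes(void)
{
    /* anonymous region that holds the malloc'ed buffer: 1056768 bytes */
    printf("anonymous region: %lu bytes\n",
           range_size(0xf7475000UL, 0xf7577000UL));
    /* all four libc-2.19.so mappings together: 1757184 bytes */
    printf("libc mappings:    %lu bytes\n",
           range_size(0xf7577000UL, 0xf7724000UL));
    /* all three ld-2.19.so mappings together: 139264 bytes */
    printf("ld mappings:      %lu bytes\n",
           range_size(0xf7749000UL, 0xf776b000UL));
}
```

The anonymous region comes out slightly larger than the 1048576 bytes requested, and the libc range is a whole number of 4096-byte pages, which is why it only roughly matches the file size.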

  1. Why does the system map the whole libc and ld shared objects into the process' memory range? I thought there would only be a pointer to the libc which is loaded in memory only once systemwide?

  2. Furthermore I definitely don't need the whole 1758972 bytes in my memory. What's happening here?

  3. Is /lib/i386-linux-gnu/libc-2.19.so in memory only once systemwide?

  1. Why does the system map the whole libc and ld shared objects into the process' memory range? I thought there would only be a pointer to the libc which is loaded in memory only once systemwide?

It is loaded once systemwide, and those pages are then mapped into each process' virtual address space. The pages are shared by every process (at least the read-only parts).

You can't just have a "pointer", because pointers can only refer to things in the process' own address space. If a library wasn't in that address space, how would you dereference the pointer? It would also mean the process needed to say "OK, I want this function, is that in my address space? No, but I have a pointer, so follow that", which would be much more complicated. Instead the OS and MMU hardware perform the indirections and mapping needed to make it appear as a single flat address space for every process.
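To see that the address of a library function really does lie inside one of the process' own mappings, the process can search /proc/self/maps for the range covering that address. The following is only a sketch: find_mapping is a hypothetical helper name, and it assumes a Linux system with /proc mounted.

```c
#include <stdint.h>
#include <stdio.h>

/* Looks up the /proc/self/maps entry whose range covers `addr` and copies
 * that line into `line_out`.
 * Returns 1 if a mapping was found, 0 if not, -1 if the file can't be read. */
int find_mapping(uintptr_t addr, char *line_out, size_t line_size)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    unsigned long start, end;
    int found = 0;

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        /* Each entry starts with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            addr >= start && addr < end) {
            snprintf(line_out, line_size, "%s", line);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling find_mapping((uintptr_t)&puts, line, sizeof line) should yield an executable mapping. Depending on how the program was linked, that line names either the libc shared object or the executable itself (when the address resolves to a stub inside the executable).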

  2. Furthermore I definitely don't need the whole 1758972 bytes in my memory. What's happening here?

Since every process using libc.so gets the same pages, it's much more efficient just to map the whole thing once and share it, rather than figure out which specific pages are needed by each process.

  3. Is /lib/i386-linux-gnu/libc-2.19.so in memory only once systemwide?

Yes, because the same pages are mapped into each process.

This all applies to any ELF shared library, not just libc.so.

You have to be able to access the elements in libc.so, so it has to be mapped into your memory. You can't access anything that isn't mapped.

As to whether it's present once or multiple times, you'll have to explain exactly what you mean. It will be mapped into the address space of every process which uses it (which is pretty much every process). But only the parts which are actually used in each process will actually be loaded into main memory. And the text segment will be mapped directly from the .so file, without any backing store in the swap area. (I believe, at least. I've never actually looked at Linux, but this is the way most virtual memory systems work. It's why code in the library will usually have to be compiled with -fPIC.)
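On Linux, the difference between what is mapped and what is actually resident can be observed in /proc/self/smaps, which reports a Size and an Rss figure for each mapping. Here is a rough sketch that sums both for every mapping whose entry contains a given substring. The function name mapped_vs_resident is made up for illustration, and the code assumes the kernel exposes smaps.

```c
#include <stdio.h>
#include <string.h>

/* Sums the Size and Rss fields (in kB) of every mapping in /proc/self/smaps
 * whose header line contains `needle` (for example "libc").
 * Returns 0 on success, -1 if the file can't be opened. */
int mapped_vs_resident(const char *needle,
                       unsigned long *size_kb, unsigned long *rss_kb)
{
    FILE *f = fopen("/proc/self/smaps", "r");
    char line[1024];
    unsigned long start, end, value;
    int selected = 0;

    if (f == NULL)
        return -1;

    *size_kb = 0;
    *rss_kb = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "%lx-%lx ", &start, &end) == 2) {
            /* Header line of a mapping: "start-end perms offset dev inode path". */
            selected = strstr(line, needle) != NULL;
        } else if (selected && sscanf(line, "Size: %lu kB", &value) == 1) {
            *size_kb += value;   /* virtual size of the mapping */
        } else if (selected && sscanf(line, "Rss: %lu kB", &value) == 1) {
            *rss_kb += value;    /* portion currently resident in RAM */
        }
    }
    fclose(f);
    return 0;
}
```

For a small program, calling it with "libc" will typically report an Rss well below the Size, since pages are only brought in when they are first touched. The exact numbers depend on the system and on what the process has executed so far.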

I thought there would only be a pointer to the libc which is loaded in memory only once systemwide?

What would such a pointer point to? A pointer can only point to something that is in your memory space.

It actually is in memory only once system-wide, though. Through the magic of memory management, it can appear in multiple processes' memory, but actually be in memory only once.

Mapping only the parts of libc that you actually use seems theoretically possible, but probably not worth the complexity, since there would be no benefit on a 64-bit machine. It would involve significant additions to the mechanisms for handling shared libraries.
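The idea that one piece of physical memory can appear in several address spaces can be made visible with a small experiment. The sketch below is not how libc itself is loaded (libc's code is a read-only, file-backed mapping set up by the dynamic loader). It uses a writable anonymous MAP_SHARED mapping and fork() purely so the sharing can be observed, and value_written_by_child is an illustrative name.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates a shared mapping, lets a child process write 42 into it, and
 * returns what the parent then reads back. Returns -1 on failure. */
int value_written_by_child(void)
{
    int result;
    pid_t pid;
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        /* Child: separate address space, same underlying physical page. */
        *shared = 42;
        _exit(0);
    }

    waitpid(pid, NULL, 0);    /* wait until the child has written */
    result = *shared;         /* the parent sees the child's write */
    munmap(shared, sizeof *shared);
    return result;
}
```

The parent reads 42 even though the write happened in another process, because both page tables point at the same physical page. Shared library text works on the same principle, except that the pages are read-only and backed by the .so file.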
