How is the virtual address of the process memory calculated?
I wrote the following program to examine the process memory layout:
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>
#define CHAR_LEN 255
char filepath[CHAR_LEN];
char line[CHAR_LEN];
char address[CHAR_LEN];
char perms[CHAR_LEN];
char offset[CHAR_LEN];
char dev[CHAR_LEN];
char inode[CHAR_LEN];
char pathname[CHAR_LEN];
int main() {
    printf("Hello world.\n");
    snprintf(filepath, sizeof(filepath), "/proc/%u/maps", (unsigned)getpid());
    FILE *f = fopen(filepath, "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    printf("%-32s %-8s %-10s %-8s %-10s %s\n", "address", "perms", "offset",
           "dev", "inode", "pathname");
    /* Note: pathname is not cleared between lines, so a line without a
       pathname field prints the pathname left over from the previous line. */
    while (fgets(line, sizeof(line), f) != NULL) {
        sscanf(line, "%s%s%s%s%s%s", address, perms, offset, dev, inode, pathname);
        printf("%-32s %-8s %-10s %-8s %-10s %s\n", address, perms, offset, dev,
               inode, pathname);
    }
    fclose(f);
    return 0;
}
I compile the program with gcc -static -O0 -g -std=gnu11 -o test_helloworld_memory_map test_helloworld_memory_map.c -lpthread. I first run readelf -l test_helloworld_memory_map and obtain:
Elf file type is EXEC (Executable file)
Entry point 0x400890
There are 6 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000c9e2e 0x00000000000c9e2e R E 200000
LOAD 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000001c98 0x0000000000003db0 RW 200000
NOTE 0x0000000000000190 0x0000000000400190 0x0000000000400190
0x0000000000000044 0x0000000000000044 R 4
TLS 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000000020 0x0000000000000050 R 8
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000000148 0x0000000000000148 R 1
Section to Segment mapping:
Segment Sections...
00 .note.ABI-tag .note.gnu.build-id .rela.plt .init .plt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata __libc_subfreeres __libc_atexit .stapsdt.base __libc_thread_subfreeres .eh_frame .gcc_except_table
01 .tdata .init_array .fini_array .jcr .data.rel.ro .got .got.plt .data .bss __libc_freeres_ptrs
02 .note.ABI-tag .note.gnu.build-id
03 .tdata .tbss
04
05 .tdata .init_array .fini_array .jcr .data.rel.ro .got
Then I run the program and obtain:
address perms offset dev inode pathname
00400000-004ca000 r-xp 00000000 fd:01 12551992 /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
006c9000-006cc000 rw-p 000c9000 fd:01 12551992 /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
006cc000-006ce000 rw-p 00000000 00:00 0 /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
018ac000-018cf000 rw-p 00000000 00:00 0 [heap]
7ffc2845c000-7ffc2847d000 rw-p 00000000 00:00 0 [stack]
7ffc28561000-7ffc28563000 r--p 00000000 00:00 0 [vvar]
7ffc28563000-7ffc28565000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
I'm confused about why the virtual address of a memory segment is different from the one shown in /proc/[pid]/maps. For example, readelf shows the 2nd memory segment at file offset 0xc9eb8 and virtual address 0x6c9eb8, but in the process memory it starts at 0x6c9000. How is this calculation done?
I know the linker specifies 0x400000 as the starting address of the first memory segment, and the process memory shows addresses aligned to the page size (4K) (e.g., 0xc9e2e is rounded up to 0xca000, plus 0x400000). I think this has something to do with the "Align" column shown by readelf. However, the ELF header documentation confuses me:
p_align This member holds the value to which the segments are aligned in memory and in the file. Loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size. Values of zero and one mean no alignment is required. Otherwise, p_align should be a positive, integral power of two, and p_vaddr should equal p_offset, modulo p_align.
Specifically, what does the last sentence mean?
Otherwise, p_align should be a positive, integral power of two, and p_vaddr should equal p_offset, modulo p_align.
What is the calculation formula it is talking about?
Thanks much!
CPU address mapping has "page" granularity, and 4K is still a very common page size. /proc/$pid/maps shows you the OS mappings; it doesn't show you which addresses the process actually cares about inside the mapped ranges. Your process only cares about what starts at offset 0xeb8 into the first page of that mapping, but the CPU (and hence the OS that's controlling it for you) can't map at byte granularity, and the linker knows it, so it lays out the disk file in CPU-page-sized blocks.
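The rounding described above can be sketched in a few lines of C. This is only an illustration: the helper names page_floor and page_offset are made up for this example and are not part of any loader API, and the page size is passed in explicitly (4096 is assumed in the usage note).

```c
#include <assert.h>
#include <stdint.h>

/* Start of the page that contains addr (round down to a page boundary).
 * page_size is assumed to be a power of two, e.g. 4096. */
static inline uint64_t page_floor(uint64_t addr, uint64_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Distance of addr from the start of its page. */
static inline uint64_t page_offset(uint64_t addr, uint64_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}
```

With the values from the question, page_floor(0x6c9eb8, 4096) gives 0x6c9000, the start address shown in /proc/[pid]/maps, and page_offset(0x6c9eb8, 4096) gives 0xeb8.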
It means that for segments other than loadable ones, i.e. those without LOAD, the last n bits of the offset must match the last n bits of the virtual address, where the value of the p_align field is 1 << n.
For example, the GNU_STACK entry says the stack can be placed anywhere, as long as the address is 16-byte aligned (Align is 0x10).
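The "equal modulo p_align" rule can be written as a small check. This is a sketch rather than code from any real loader; the function name congruent is hypothetical, and it treats an alignment of 0 or 1 as "no alignment required", as the quoted documentation says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if p_vaddr and p_offset are congruent modulo align,
 * i.e. their low n bits agree, where align == 1 << n. */
static inline bool congruent(uint64_t p_vaddr, uint64_t p_offset, uint64_t align) {
    if (align <= 1)
        return true;                      /* 0 and 1 mean no alignment required */
    assert((align & (align - 1)) == 0);   /* otherwise must be a power of two */
    return (p_vaddr & (align - 1)) == (p_offset & (align - 1));
}
```

For the second LOAD entry in the question, congruent(0x6c9eb8, 0xc9eb8, 0x200000) holds, and so does the same check with the 4096-byte page size in place of 0x200000.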
Loadable segments need to be at least page-aligned, in the sense that the offset and the virtual address must agree modulo the page size. Take the second one from your example:
Offset VirtAddr
LOAD 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000001c98 0x0000000000003db0 RW 200000
Given a page size of 4096, the last 12 bits of the offset must be the same as the last 12 bits of the virtual address. This is because the loader (the dynamic linker for shared objects; for a static executable like this one, the kernel's ELF loader) usually uses mmap to map the pages directly from the file into memory, and this can only be done at page granularity. So in fact the loader mapped the first part of this range from the file, starting at the page boundary below the segment's offset and address:
006c9000-006cc000 rw-p 000c9000 fd:01 12551992
/home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
Further, notice that the file size (FileSiz) is less than the memory size (MemSiz); the rest of the data is zero-filled and lives in the other, anonymous mapping (inode 0):
006cc000-006ce000 rw-p 00000000 00:00 0
/home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
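To make the two mappings above concrete, here is a simplified model of how one LOAD entry turns into a file-backed range followed by an anonymous, zero-filled range. The struct and function names are hypothetical, and a real loader does more than this (for example, it also zeroes the tail of the last file-backed page past p_vaddr + p_filesz); treat it as a sketch of the arithmetic only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: page-granular ranges derived from one LOAD entry. */
struct load_ranges {
    uint64_t file_start;   /* first address backed by the file */
    uint64_t file_offset;  /* file offset mapped at file_start */
    uint64_t file_end;     /* end of the file-backed mapping (exclusive) */
    uint64_t anon_end;     /* end of the zero-filled mapping (exclusive) */
};

static inline uint64_t round_down(uint64_t x, uint64_t page) {
    return x & ~(page - 1);
}

static inline uint64_t round_up(uint64_t x, uint64_t page) {
    return round_down(x + page - 1, page);
}

/* Split a segment into its file-backed part and its anonymous tail. */
static inline struct load_ranges split_segment(uint64_t p_offset, uint64_t p_vaddr,
                                               uint64_t p_filesz, uint64_t p_memsz,
                                               uint64_t page) {
    assert(page != 0 && (page & (page - 1)) == 0);  /* power of two */
    assert(p_filesz <= p_memsz);
    struct load_ranges r;
    r.file_start  = round_down(p_vaddr, page);
    r.file_offset = round_down(p_offset, page);
    r.file_end    = round_up(p_vaddr + p_filesz, page);
    r.anon_end    = round_up(p_vaddr + p_memsz, page);
    return r;
}
```

Calling split_segment(0xc9eb8, 0x6c9eb8, 0x1c98, 0x3db0, 4096) yields a file-backed range 0x6c9000-0x6cc000 at file offset 0xc9000 and an anonymous range ending at 0x6ce000, which matches the two rw-p lines quoted above.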
If you read the bytes at 0x00000000006c9000 - 0x00000000006c9eb7 you should see exactly the same bytes as those at 0x00000000004c9000 - 0x00000000004c9eb7. This is because the data segment and the code segment come right after each other in the file without padding to a page boundary, so the file page at offset 0xc9000 is mapped twice. This saves a lot of disk space and actually helps save RAM too, because the executable takes less space in the block device caches!
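One way to see why the two address ranges hold the same bytes is to translate each address back to a file offset using the start and offset columns of its /proc/[pid]/maps line. The struct and function below are illustrative names, not an existing API.

```c
#include <assert.h>
#include <stdint.h>

/* One file-backed line of /proc/[pid]/maps:
 * [start, end) is mapped from the file beginning at `offset`. */
struct mapping {
    uint64_t start;
    uint64_t end;
    uint64_t offset;
};

/* File offset that backs virtual address addr inside mapping m. */
static inline uint64_t file_offset_of(struct mapping m, uint64_t addr) {
    assert(addr >= m.start && addr < m.end);
    return addr - m.start + m.offset;
}
```

Using the first two lines of the maps output, {0x400000, 0x4ca000, 0x0} and {0x6c9000, 0x6cc000, 0xc9000}, both 0x4c9000 and 0x6c9000 translate to file offset 0xc9000.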