
How is the virtual address of process memory calculated?

I wrote the following program to examine the process memory layout:

#include <stdio.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

#define CHAR_LEN 255

char filepath[CHAR_LEN];
char line[CHAR_LEN];
char address[CHAR_LEN];
char perms[CHAR_LEN];
char offset[CHAR_LEN];
char dev[CHAR_LEN];
char inode[CHAR_LEN];
char pathname[CHAR_LEN];

int main() {
  printf("Hello world.\n");

  sprintf(filepath, "/proc/%u/maps", (unsigned)getpid());
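  FILE *f = fopen(filepath, "r");
  if (f == NULL) {
    perror("fopen");
    return 1;
  }

  printf("%-32s %-8s %-10s %-8s %-10s %s\n", "address", "perms", "offset",
         "dev", "inode", "pathname");
  while (fgets(line, sizeof(line), f) != NULL) {
    /* Field widths keep sscanf inside the CHAR_LEN buffers. A line without a
       pathname leaves `pathname` holding the previous line's value. */
    sscanf(line, "%254s%254s%254s%254s%254s%254s", address, perms, offset, dev,
           inode, pathname);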
    printf("%-32s %-8s %-10s %-8s %-10s %s\n", address, perms, offset, dev,
           inode, pathname);
  }

  fclose(f);
  return 0;
}

I compile the program with gcc -static -O0 -g -std=gnu11 -o test_helloworld_memory_map test_helloworld_memory_map.c -lpthread . I first run readelf -l test_helloworld_memory_map and obtain:

Elf file type is EXEC (Executable file)
Entry point 0x400890
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000c9e2e 0x00000000000c9e2e  R E    200000
  LOAD           0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
                 0x0000000000001c98 0x0000000000003db0  RW     200000
  NOTE           0x0000000000000190 0x0000000000400190 0x0000000000400190
                 0x0000000000000044 0x0000000000000044  R      4
  TLS            0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
                 0x0000000000000020 0x0000000000000050  R      8
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
                 0x0000000000000148 0x0000000000000148  R      1

 Section to Segment mapping:
  Segment Sections...
   00     .note.ABI-tag .note.gnu.build-id .rela.plt .init .plt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata __libc_subfreeres __libc_atexit .stapsdt.base __libc_thread_subfreeres .eh_frame .gcc_except_table
   01     .tdata .init_array .fini_array .jcr .data.rel.ro .got .got.plt .data .bss __libc_freeres_ptrs
   02     .note.ABI-tag .note.gnu.build-id
   03     .tdata .tbss
   04
   05     .tdata .init_array .fini_array .jcr .data.rel.ro .got

Then I run the program and obtain:

address                          perms    offset     dev      inode      pathname
00400000-004ca000                r-xp     00000000   fd:01    12551992   /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
006c9000-006cc000                rw-p     000c9000   fd:01    12551992   /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
006cc000-006ce000                rw-p     00000000   00:00    0          /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
018ac000-018cf000                rw-p     00000000   00:00    0          [heap]
7ffc2845c000-7ffc2847d000        rw-p     00000000   00:00    0          [stack]
7ffc28561000-7ffc28563000        r--p     00000000   00:00    0          [vvar]
7ffc28563000-7ffc28565000        r-xp     00000000   00:00    0          [vdso]
ffffffffff600000-ffffffffff601000 r-xp     00000000   00:00    0          [vsyscall]

I'm confused about why the virtual address of a memory segment is different from the one shown in "/proc/[pid]/maps". For example, readelf shows the 2nd memory segment at virtual address 0x6c9eb8 (file offset 0xc9eb8), but in the process memory it starts at 0x6c9000 . How is this calculation done?

I know the linker specifies 0x400000 as the starting address of the first memory segment, and the process memory shows addresses aligned to the page size (4K) (e.g., 0xc9e2e is rounded up to 0xca000 and added to 0x400000 , giving 0x4ca000 ). I think this has something to do with the "Align" column shown by readelf . However, reading the description of the ELF program header confuses me:

  p_align This member holds the value to which the segments are aligned in memory and in the file. Loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size. Values of zero and one mean no alignment is required. Otherwise, p_align should be a positive, integral power of two, and p_vaddr should equal p_offset, modulo p_align.

Specifically, what does the last sentence mean?

Otherwise, p_align should be a positive, integral power of two, and p_vaddr should equal p_offset, modulo p_align.

What calculation is it describing?

Thanks very much!

CPU address mapping has "page" granularity, and 4K is still a very common page size. /proc/$pid/maps shows you the OS mappings; it doesn't show which addresses inside the mapped ranges the process actually cares about. Your process only cares about what starts at offset 0xeb8 into the first page mapped for the second segment, but the CPU (and hence the OS that controls it for you) cannot map at byte granularity, and the linker knows this, so it lays out the file on disk in CPU-page-sized blocks.

It means that for segments other than loadable ones, i.e. those whose type is not LOAD , the last n bits of the offset must match the last n bits of the virtual address, where the value of the p_align field is 1 << n .

For example, the stack entry says it can be placed anywhere, as long as the address is 16-byte aligned.

Loadable segments need at least page alignment. Take the second one from your example:

  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
                 0x0000000000001c98 0x0000000000003db0  RW     200000

Given a page size of 4096, the last 12 bits of the offset must be the same as the last 12 bits of the virtual address. This is because the loader (the kernel's ELF loader for a static executable like this one; a dynamic linker does the same for shared libraries) uses mmap to map pages directly from the file into memory, and this can only be done at page granularity. So the loader did in fact map the first part of this range from the file:

006c9000-006cc000                rw-p     000c9000   fd:01    12551992   /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map

Note also that the file size (FileSiz, 0x1c98) is smaller than the memory size (MemSiz, 0x3db0). The rest of the segment is zero-filled, and the part that does not fit into the file-backed pages ends up in the other, anonymous mapping:

006cc000-006ce000                rw-p     00000000   00:00    0          /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map

If you read the bytes at 0x00000000006c9000 - 0x00000000006c9eb7 you should see exactly the same bytes as those at 0x00000000004c9000 - 0x00000000004c9eb7 . This is because the data segment and the code segment come right after each other in the file without padding, so the same file page is mapped at both addresses. This saves a lot of disk space and also helps save RAM, because the executable takes less space in the block device caches!

