
What parts of a process's virtual address space are overwritable?

For instance, let's suppose that instead of buffers growing in the opposite direction of the stack, they grow in the same direction. If I have a character buffer containing the string "Hello world", instead of 'H' being placed at the lowest address, it is placed at the highest address, and so on.

If an input string copied to a buffer were to overflow, it could not overwrite the return address of the function, but certainly there are other things it could overwrite. My question is: if the input string was long enough, what things could be overwritten? Are there library functions that exist between the heap and the stack that could be overwritten? Can heap variables be overwritten? I assume that variables in the data and bss sections can be overwritten, but is the text segment protected from writes?
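To make the scenario concrete, here is a toy model, not real process memory: a `bytearray` stands in for one stack frame. The frame layout, the offsets, and the names `new_frame`, `copy_ascending`, and `copy_descending` are all assumptions made up for this illustration; a real compiler is free to lay out a frame differently.

```python
# Toy model only: a bytearray stands in for one stack frame.
# The layout below is an assumption for illustration, not what any
# particular compiler produces.
FRAME_SIZE = 32
OTHER_LOCALS = slice(0, 8)    # lower addresses: other data below the buffer
BUFFER = slice(8, 24)         # the 16-byte character buffer
RETURN_ADDR = slice(24, 32)   # higher addresses: saved return address


def new_frame() -> bytearray:
    """Build a frame with recognizable marker bytes around the buffer."""
    frame = bytearray(FRAME_SIZE)
    frame[OTHER_LOCALS] = b"LOCALS.."
    frame[RETURN_ADDR] = b"RETADDR!"
    return frame


def _store(frame: bytearray, index: int, byte: int) -> None:
    # Leaving the modeled frame stands in for touching unmapped memory.
    if not 0 <= index < len(frame):
        raise IndexError("write left the modeled frame")
    frame[index] = byte


def copy_ascending(frame: bytearray, data: bytes) -> None:
    """Conventional unchecked copy: first byte at the buffer's lowest address."""
    for i, byte in enumerate(data):
        _store(frame, BUFFER.start + i, byte)


def copy_descending(frame: bytearray, data: bytes) -> None:
    """Hypothetical copy from the question: first byte at the highest address."""
    for i, byte in enumerate(data):
        _store(frame, BUFFER.stop - 1 - i, byte)


if __name__ == "__main__":
    for copy in (copy_ascending, copy_descending):
        frame = new_frame()
        copy(frame, b"A" * 24)  # 8 bytes more than the buffer holds
        print(copy.__name__, bytes(frame))
```

In this model the ascending copy clobbers the saved return address, while the descending copy leaves it alone and clobbers whatever sits below the buffer instead.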

The answer to your question depends entirely on which operating system is being used, as well as which hardware architecture. The operating system lays out logical memory in a certain fashion, and the architecture sometimes reserves (very low) memory for specific purposes as well.

One thing to understand is that traditional processes can address their entire logical memory space, but very little of this capacity is typically used. The most likely effect of what you describe is that you'll try to access some unallocated memory and get a segfault in response, crashing your program.

That said, you definitely can write to these other segments of memory, but what happens when you do so depends on their read/write permissions. For example, the typical memory layout you learn in school is:

Low memory to high memory:
.text - program code
.data - initialized static variables
.bss  - uninitialized static variables
.heap - grows up
memory map segments - dynamic libraries
.stack - grows down

The .text segment is marked read-only/executable by default, so if you attempt to write to a .text memory location you'll get a segmentation fault. It's possible to change .text to writable, but this is in general a terrible idea.

The .data, .bss, .heap, and .stack segments are all readable/writable by default, so you can overwrite those sections without triggering a fault (although the program may well misbehave later because of the corrupted data).

The memory map segments each have their own permissions as well. Some of these segments are writable, most are not (so writing to them causes segfaults).

The last thing to note is that most modern OSes will randomize the locations of these segments to make things more difficult for hackers. This may introduce gaps between different segments (which will again cause segfaults if you try to access them).

On Linux, you can print out a process's memory map with the command pmap. The following is the output of this program on an instance of vim:

10636:   vim hello.text
0000000000400000   2112K r-x-- vim
000000000080f000      4K r---- vim
0000000000810000     88K rw--- vim
0000000000826000     56K rw---   [ anon ]
0000000000851000   2228K rw---   [ anon ]
00007f7df24c6000   8212K r--s- passwd
00007f7df2ccb000     32K r-x-- libnss_sss.so.2
00007f7df2cd3000   2044K ----- libnss_sss.so.2
00007f7df2ed2000      4K r---- libnss_sss.so.2
00007f7df2ed3000      4K rw--- libnss_sss.so.2
00007f7df2ed4000     48K r-x-- libnss_files-2.17.so
00007f7df2ee0000   2044K ----- libnss_files-2.17.so
00007f7df30df000      4K r---- libnss_files-2.17.so
00007f7df30e0000      4K rw--- libnss_files-2.17.so
00007f7df30e1000     24K rw---   [ anon ]
00007f7df30e7000 103580K r---- locale-archive
00007f7df960e000      8K r-x-- libfreebl3.so
00007f7df9610000   2044K ----- libfreebl3.so
00007f7df980f000      4K r---- libfreebl3.so
00007f7df9810000      4K rw--- libfreebl3.so
00007f7df9811000      8K r-x-- libutil-2.17.so
00007f7df9813000   2044K ----- libutil-2.17.so
00007f7df9a12000      4K r---- libutil-2.17.so
00007f7df9a13000      4K rw--- libutil-2.17.so
00007f7df9a14000     32K r-x-- libcrypt-2.17.so
00007f7df9a1c000   2044K ----- libcrypt-2.17.so
00007f7df9c1b000      4K r---- libcrypt-2.17.so
00007f7df9c1c000      4K rw--- libcrypt-2.17.so
00007f7df9c1d000    184K rw---   [ anon ]
00007f7df9c4b000     88K r-x-- libnsl-2.17.so
00007f7df9c61000   2044K ----- libnsl-2.17.so
00007f7df9e60000      4K r---- libnsl-2.17.so
00007f7df9e61000      4K rw--- libnsl-2.17.so
00007f7df9e62000      8K rw---   [ anon ]
00007f7df9e64000     88K r-x-- libresolv-2.17.so
00007f7df9e7a000   2048K ----- libresolv-2.17.so
00007f7dfa07a000      4K r---- libresolv-2.17.so
00007f7dfa07b000      4K rw--- libresolv-2.17.so
00007f7dfa07c000      8K rw---   [ anon ]
00007f7dfa07e000    152K r-x-- libncurses.so.5.9
00007f7dfa0a4000   2044K ----- libncurses.so.5.9
00007f7dfa2a3000      4K r---- libncurses.so.5.9
00007f7dfa2a4000      4K rw--- libncurses.so.5.9
00007f7dfa2a5000     16K r-x-- libattr.so.1.1.0
00007f7dfa2a9000   2044K ----- libattr.so.1.1.0
00007f7dfa4a8000      4K r---- libattr.so.1.1.0
00007f7dfa4a9000      4K rw--- libattr.so.1.1.0
00007f7dfa4aa000    144K r-x-- liblzma.so.5.0.99
00007f7dfa4ce000   2044K ----- liblzma.so.5.0.99
00007f7dfa6cd000      4K r---- liblzma.so.5.0.99
00007f7dfa6ce000      4K rw--- liblzma.so.5.0.99
00007f7dfa6cf000    384K r-x-- libpcre.so.1.2.0
00007f7dfa72f000   2044K ----- libpcre.so.1.2.0
00007f7dfa92e000      4K r---- libpcre.so.1.2.0
00007f7dfa92f000      4K rw--- libpcre.so.1.2.0
00007f7dfa930000   1756K r-x-- libc-2.17.so
00007f7dfaae7000   2048K ----- libc-2.17.so
00007f7dface7000     16K r---- libc-2.17.so
00007f7dfaceb000      8K rw--- libc-2.17.so
00007f7dfaced000     20K rw---   [ anon ]
00007f7dfacf2000     88K r-x-- libpthread-2.17.so
00007f7dfad08000   2048K ----- libpthread-2.17.so
00007f7dfaf08000      4K r---- libpthread-2.17.so
00007f7dfaf09000      4K rw--- libpthread-2.17.so
00007f7dfaf0a000     16K rw---   [ anon ]
00007f7dfaf0e000   1548K r-x-- libperl.so
00007f7dfb091000   2044K ----- libperl.so
00007f7dfb290000     16K r---- libperl.so
00007f7dfb294000     24K rw--- libperl.so
00007f7dfb29a000      4K rw---   [ anon ]
00007f7dfb29b000     12K r-x-- libdl-2.17.so
00007f7dfb29e000   2044K ----- libdl-2.17.so
00007f7dfb49d000      4K r---- libdl-2.17.so
00007f7dfb49e000      4K rw--- libdl-2.17.so
00007f7dfb49f000     20K r-x-- libgpm.so.2.1.0
00007f7dfb4a4000   2048K ----- libgpm.so.2.1.0
00007f7dfb6a4000      4K r---- libgpm.so.2.1.0
00007f7dfb6a5000      4K rw--- libgpm.so.2.1.0
00007f7dfb6a6000     28K r-x-- libacl.so.1.1.0
00007f7dfb6ad000   2048K ----- libacl.so.1.1.0
00007f7dfb8ad000      4K r---- libacl.so.1.1.0
00007f7dfb8ae000      4K rw--- libacl.so.1.1.0
00007f7dfb8af000    148K r-x-- libtinfo.so.5.9
00007f7dfb8d4000   2048K ----- libtinfo.so.5.9
00007f7dfbad4000     16K r---- libtinfo.so.5.9
00007f7dfbad8000      4K rw--- libtinfo.so.5.9
00007f7dfbad9000    132K r-x-- libselinux.so.1
00007f7dfbafa000   2048K ----- libselinux.so.1
00007f7dfbcfa000      4K r---- libselinux.so.1
00007f7dfbcfb000      4K rw--- libselinux.so.1
00007f7dfbcfc000      8K rw---   [ anon ]
00007f7dfbcfe000   1028K r-x-- libm-2.17.so
00007f7dfbdff000   2044K ----- libm-2.17.so
00007f7dfbffe000      4K r---- libm-2.17.so
00007f7dfbfff000      4K rw--- libm-2.17.so
00007f7dfc000000    132K r-x-- ld-2.17.so
00007f7dfc1f8000     40K rw---   [ anon ]
00007f7dfc220000      4K rw---   [ anon ]
00007f7dfc221000      4K r---- ld-2.17.so
00007f7dfc222000      4K rw--- ld-2.17.so
00007f7dfc223000      4K rw---   [ anon ]
00007ffcb46e7000    132K rw---   [ stack ]
00007ffcb475f000      8K r-x--   [ anon ]
ffffffffff600000      4K r-x--   [ anon ]
 total           163772K

The segment starting at 0x851000 is actually the start of the heap (pmap will tell you this in its more verbose reporting modes, but that output didn't fit here).
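If you want to pick the writable regions out of a listing like the one above mechanically, a small parser is enough. This is a minimal sketch that assumes the default four-column pmap format shown here (address, size in K, mode, mapping name); `Region`, `parse_pmap_row`, and `writable` are names invented for the example, and the sample rows are copied from the vim output.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int      # starting virtual address
    size_kib: int   # size in KiB, as printed by pmap
    perms: str      # mode column, e.g. "rw---"
    name: str       # mapping name, e.g. "vim" or "[ anon ]"


def parse_pmap_row(row: str) -> Region:
    """Parse one row of the four-column pmap output shown above."""
    address, size, perms, name = row.split(None, 3)
    return Region(int(address, 16), int(size.rstrip("K")), perms, name.strip())


def writable(regions):
    """Keep only the regions whose mode column includes the write bit."""
    return [r for r in regions if "w" in r.perms]


# A few rows copied from the vim listing.
ROWS = """\
0000000000400000   2112K r-x-- vim
000000000080f000      4K r---- vim
0000000000810000     88K rw--- vim
0000000000826000     56K rw---   [ anon ]
00007f7df2cd3000   2044K ----- libnss_sss.so.2
"""

if __name__ == "__main__":
    regions = [parse_pmap_row(row) for row in ROWS.splitlines()]
    for region in writable(regions):
        print(f"{region.start:#x} {region.size_kib}K {region.perms} {region.name}")
```

Only the rows with a `w` in the mode column are places an overflow could land without faulting immediately.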

The layout of processes in memory varies from system to system. This answer covers Linux on x86_64 processors.

There is a nice article illustrating the memory layout for Linux processes here.

If the buffer is a local variable, then it will be on the stack, along with other local variables. The first thing you are likely to hit if you overflow the buffer is other local variables in the same function.
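The "neighbouring locals get hit first" effect can be shown safely with `ctypes`. This is a sketch under assumptions: the `Locals` structure is a hypothetical stand-in for two adjacent local variables (a real compiler may order or pad locals differently), and the copy is deliberately capped at the size of the structure so it never touches memory it does not own. It uses the conventional low-to-high copy; with the reversed growth from the question the victim would be the neighbour on the other side.

```python
import ctypes


class Locals(ctypes.Structure):
    # Hypothetical stand-in for two adjacent local variables.
    _fields_ = [
        ("buf", ctypes.c_char * 8),     # the buffer being filled
        ("is_admin", ctypes.c_uint8),   # a neighbour that should stay 0
    ]


def unchecked_copy(frame: Locals, data: bytes) -> None:
    """Copy without checking the size of `buf`, but never past the struct."""
    count = min(len(data), ctypes.sizeof(frame))
    ctypes.memmove(ctypes.addressof(frame), data, count)


if __name__ == "__main__":
    frame = Locals()
    unchecked_copy(frame, b"AAAAAAAA\x01")  # 9 bytes into an 8-byte buffer
    print("is_admin after overflow:", frame.is_admin)
```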

When you reach the end of the stack, there is a randomly sized offset before the next used segment of memory. If you continue writing into this address space you would trigger a segfault (since that address space is not mapped to any physical RAM).
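You can see such unmapped holes in a pmap listing by comparing where one region ends with where the next begins. A minimal sketch, using three consecutive rows from the vim listing earlier on this page; `find_gaps` is a helper invented for the example and expects (start address, size in KiB) pairs sorted by address.

```python
def find_gaps(regions):
    """Return (gap_start, gap_kib) for each hole between consecutive regions.

    `regions` is a list of (start_address, size_kib) pairs sorted by address.
    """
    gaps = []
    for (start, size_kib), (next_start, _) in zip(regions, regions[1:]):
        end = start + size_kib * 1024
        if next_start > end:
            gaps.append((end, (next_start - end) // 1024))
    return gaps


# Three consecutive rows from the vim listing: address and size in KiB.
VIM_ROWS = [(0x810000, 88), (0x826000, 56), (0x851000, 2228)]

if __name__ == "__main__":
    for gap_start, gap_kib in find_gaps(VIM_ROWS):
        print(f"unmapped hole at {gap_start:#x}, {gap_kib}K")
```

The first two rows are contiguous, while the third starts 116K after the second ends; any write into that hole would fault.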

Assuming you managed to skip over the random offset without crashing, and continued overwriting, the next thing you might hit is the memory mapping segment. This segment contains file mappings, including those used to map dynamic shared libraries into the address space, and anonymous mappings. The dynamic libraries are going to be read-only, but if the process had any RW mappings in place you could perhaps overwrite data in them.

After this segment comes another random offset before you hit the heap. Again, if you tried to write into the address space of the random offset you would trigger a crash.

Below the heap comes another random offset, followed by the BSS, data and finally text segments. Static variables within BSS and data could be overwritten. The text segment should not be writable.

You can inspect the memory map of a process using the pmap command.
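On Linux the same information is also exposed as text in /proc/<pid>/maps, so a process can inspect its own map. A rough sketch, assuming the usual field order of that file (address range, permissions, offset, device, inode, optional pathname); `parse_maps_line` and `read_maps` are names made up for the example, and `read_maps` returns an empty list where the file does not exist.

```python
import os


def parse_maps_line(line: str) -> dict:
    """Parse one line of /proc/<pid>/maps into a small dict."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],  # e.g. "rw-p"; the last flag is private/shared
        "path": fields[5].strip() if len(fields) > 5 else "",
    }


def read_maps(pid="self"):
    """Return the parsed mappings of a process, or [] if /proc is unavailable."""
    path = f"/proc/{pid}/maps"
    if not os.path.exists(path):
        return []
    with open(path) as maps_file:
        return [parse_maps_line(line) for line in maps_file if line.strip()]


if __name__ == "__main__":
    for region in read_maps():
        if "w" in region["perms"]:
            size_kib = (region["end"] - region["start"]) // 1024
            print(f"{region['start']:#x} {size_kib}K {region['perms']} {region['path']}")
```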

I think your question reflects a fundamental misunderstanding of how things work in an operating system. Things like "buffers" and "stack" tend not to be defined by the operating system.

The operating system divides memory into kernel and user areas (and some systems have additional, protected areas).

The layout of the user area is usually defined by the linker. The linker creates executables that instruct the loader how to set up the address space. Different linkers offer different levels of control. Generally, the default linker settings group the sections of a program into something like:

- Read/execute
- Read/no execute
- Read/write/initialized
- Read/write/demand zero

On some linkers you can create multiple program sections with these attributes.

You ask:

"If I have a character buffer containing the string "Hello world", instead of 'H' being placed at the lowest address, it is placed at the highest address, and so on."

In a von Neumann machine, memory is independent of its usage. The same memory block can simultaneously be interpreted as a string, floating point, integer, or instruction. You can put your letters in any order you want, but most software libraries would not recognize them in reverse order. If your own libraries can handle strings stored backwards, knock yourself out.

"My question is -- if the input string was long enough, what things could be overwritten?"

It could be anything.

"Are there library functions that exist between the heap and the stack that could be overwritten?"

That depends upon what your linker did.

"Can heap variables be overwritten?"

Heap can be overwritten.

"I assume that variables in the data and bss sections can be overwritten, but is the text segment protected from writes?"

Generally, yes.
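As a small illustration of the earlier point that memory is independent of its usage, the same four bytes can be read as text, as an integer, or as a float. This sketch uses Python's standard `struct` module; the little-endian format codes are a choice made for the example, and the variable names are arbitrary.

```python
import struct

raw = b"Hell"  # the first four bytes of "Hello world"

as_text = raw.decode("ascii")              # interpreted as characters
(as_uint32,) = struct.unpack("<I", raw)    # same bytes as a little-endian uint32
(as_float32,) = struct.unpack("<f", raw)   # same bytes as a little-endian float

# Storing the string backwards is just a different byte order in memory;
# only code that knows about the convention can read it back.
reversed_storage = b"Hello world"[::-1]

if __name__ == "__main__":
    print(as_text, hex(as_uint32), as_float32)
    print(reversed_storage, reversed_storage[::-1])
```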
