An expected buffer overflow does not always cause the program to crash

Question

Consider the following minimal C program:

Case 1:

#include <stdio.h>
#include <string.h>

void foo(char* s)
{
    char buffer[10];
    strcpy(buffer,s);
}

int main(void)
{
    foo("01234567890134567");
}

This doesn't cause a crash dump.

If I add just one character, so that the new main is:

Case 2:

void main()
{
    foo("012345678901345678");
                          ^   
}

The program crashes with a segmentation fault.

It looks as if, in addition to the 10 characters reserved on the stack, there is room for 8 more characters, so the first program doesn't crash. However, if you add one more character you start accessing invalid memory. My questions are:

Why do we have these 8 additional characters reserved on the stack?
Is this somehow related to the alignment of the char data type in memory?

Another doubt I have is how the OS (Windows in this case) detects the bad memory access. According to the Windows documentation (Stack Size), the default stack size is 1MB, so I don't see how the OS detects that the address being accessed is outside the process memory, especially when the minimum page size is normally 4k. Does the OS use the SP in this case to check the address?

PS: I'm using the following environment for testing:
Cygwin
GCC 4.8.3
Windows 7

EDIT:

This is the assembly generated by http://gcc.godbolt.org/# but using GCC 4.8.2, since I can't see GCC 4.8.3 among the available compilers. I guess the generated code should be similar. I built the code without any flags. I hope somebody with assembly expertise can shed some light on what is happening in the foo function and why the extra char causes the segfault.

    foo(char*):
    pushq   %rbp
    movq    %rsp, %rbp
    subq    $48, %rsp
    movq    %rdi, -40(%rbp)
    movq    %fs:40, %rax
    movq    %rax, -8(%rbp)
    xorl    %eax, %eax
    movq    -40(%rbp), %rdx
    leaq    -32(%rbp), %rax
    movq    %rdx, %rsi
    movq    %rax, %rdi
    call    strcpy
    movq    -8(%rbp), %rax
    xorq    %fs:40, %rax
    je  .L2
    call    __stack_chk_fail
.L2:
    leave
    ret
.LC0:
    .string "01234567890134567"
main:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    $.LC0, %edi
    call    foo(char*)
    movl    $0, %eax
    popq    %rbp
    ret

Answer 1

I believe you understand that you have written something that leads to Undefined Behavior, so it is hard to say why it fails with the longer string and not with the original one. It probably depends on the compiler's internal implementation and on the compilation flags (alignment, optimizations, etc.).

You can try disassembling the binary, or generating the assembly code, to see exactly where the buffer is placed on the stack. You can do the same at different optimization levels to inspect how the assembly code and the behavior change.
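As a complement to reading the disassembly, the placement can also be observed from inside the program. The following is only a rough sketch: it assumes GCC or Clang (it relies on the compiler builtin `__builtin_frame_address`), and the function names are made up for illustration. The value it reports is specific to one compiler, one set of flags and one platform.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns the distance in bytes between the start of
   a 10-byte local array and the frame address of the enclosing function.
   __builtin_frame_address(0) is a GCC/Clang builtin, not standard C. */
long long buffer_to_frame_gap(void)
{
    char buffer[10];
    buffer[0] = '\0';   /* touch the array so it is really used */

    /* Convert both pointers to integers before subtracting, because
       subtracting pointers to unrelated objects is not defined in C. */
    intptr_t frame = (intptr_t)__builtin_frame_address(0);
    intptr_t start = (intptr_t)buffer;
    return (long long)(frame - start);
}

/* Hypothetical helper: prints the gap measured above. */
void print_buffer_to_frame_gap(void)
{
    printf("frame address - buffer start = %lld bytes\n",
           buffer_to_frame_gap());
}
```

Calling print_buffer_to_frame_gap() from main at different optimization levels gives a quick feel for how much the layout moves around. The disassembly remains the authoritative source.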

"How does the OS (Windows in this case) detect the bad memory access? According to the Windows documentation (Stack Size), the default stack size is 1MB, so I don't see how the OS detects that the address being accessed is outside the process memory, especially when the minimum page size is normally 4k. Does the OS use the SP in this case to check the address?"

The OS doesn't monitor the code you execute. The hardware (CPU) does, since it is what executes the code. Once your code tries to access an address that was not allocated to your process (not mapped by the OS for your program), the hardware raises a #PF (page fault) exception and the OS is notified. Another case is accessing an address that was allocated to you but with the wrong permissions (for example, trying to execute binary data from a DATA page that has no 'execute' permission), or jumping into a CODE page at a wrong offset, where the bytes you read either don't decode to a valid instruction or (even worse) decode to something you don't expect (did we mention Undefined Behavior?).

In general your code most likely doesn't fail in strcpy (it can, if you write enough data to reach a forbidden address, but that is probably not the case here); it fails when it returns from the foo function. strcpy has overwritten the saved return address, which points to the instruction following the call to foo. The instruction pointer is therefore loaded with data from the "012345678901345678" string, the CPU tries to fetch the next instruction from that junk address, and it fails for the reasons mentioned above.

This "method"/bug is called a "buffer overflow attack" and is widely used by hackers to make your code (and more often OS/BIOS/VMM/SMM code, which runs with higher privileges) execute malicious code that they supply. They just need to overwrite the instruction pointer with the address of code prepared in advance.

Answer 2

The official, system-agnostic answer is:

Your code writes data beyond the end of the destination array, so the behaviour is undefined: anything can happen, including nothing at all or a space probe crashing on the surface of Mars. Observing no noticeable effect for up to 8 bytes beyond the end of the buffer, and a crash with a segmentation fault beyond that, are both possible effects of undefined behaviour, well within the expected outcome.

The extra implementation details you are interested in:

Actual behaviour will depend on many circumstances, for example which compiler you use, which OS, which ABI (Application Binary Interface), etc.

Your program is compiled and executed in a 64-bit Windows environment. In this environment, the stack is kept aligned on 64-bit boundaries, or possibly 16-byte boundaries to allow direct loading and storing of the SSE (XMM) registers from/to stack locations. The array buffer[10] occupies 16 bytes on the stack. Given how the stack is laid out on this processor, it will sit just below the locations that function foo uses to store any saved registers and the return address into the calling function main. Whether the extra 6 bytes come before or after the array is the compiler's choice. It could use this space for other local variables or just ignore it.

Writing beyond the end of buffer may be harmless for up to 6 bytes if the padding is after the array, and might not have any noticeable effect for another 8 bytes (clobbering the saved rbp register, which is unused in main after the call), but it will start having bad side effects beyond that, because you will be overwriting the return address.
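The arithmetic in the two paragraphs above can be written out as a small model. This is only a sketch under the layout described in this answer (10-byte array, 6 bytes of padding after it, then an 8-byte saved rbp, then an 8-byte return address); the sizes, the enum and the function name are assumptions made for illustration, and the real layout has to be read from the disassembly. Nothing here actually overflows a buffer; it only counts bytes.

```c
#include <stddef.h>
#include <string.h>

/* Assumed frame layout, lowest address first (hypothetical sizes). */
enum {
    BUF_SIZE       = 10, /* char buffer[10]                  */
    PAD_SIZE       = 6,  /* padding, assumed after the array */
    SAVED_RBP_SIZE = 8,  /* caller's saved frame pointer     */
    RET_ADDR_SIZE  = 8   /* return address into main         */
};

enum region {
    IN_BOUNDS,       /* copy fits in the array               */
    PADDING,         /* spills into the padding only         */
    SAVED_RBP,       /* reaches the saved rbp slot           */
    RETURN_ADDRESS,  /* reaches the return address           */
    BEYOND_FRAME     /* runs past the modelled frame         */
};

/* Returns the highest region that strcpy(buffer, s) would write to
   under the assumed layout. strcpy also copies the terminating NUL,
   hence the + 1. */
enum region highest_region_written(const char *s)
{
    size_t bytes = strlen(s) + 1;
    size_t limit = BUF_SIZE;

    if (bytes <= limit) return IN_BOUNDS;
    limit += PAD_SIZE;
    if (bytes <= limit) return PADDING;
    limit += SAVED_RBP_SIZE;
    if (bytes <= limit) return SAVED_RBP;
    limit += RET_ADDR_SIZE;
    if (bytes <= limit) return RETURN_ADDRESS;
    return BEYOND_FRAME;
}
```

With these assumed sizes, both strings from the question (18 and 19 bytes including the NUL) end inside the saved rbp slot, and the return address is only reached from 25 bytes onward. The layout on the asker's machine may therefore differ from this model, which is one more reason to check the generated assembly.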

When you overwrite the return address, the processor will not return from function foo to the caller main, but to whatever address is stored on the stack after being corrupted by the offending code. If this corrupted address points to executable code, that code will be executed, with potentially harmful consequences. Hackers do exactly this: they carefully craft an exploit that manages to store some harmful code at a known location in executable memory and take advantage of the buffer overflow to store the address of that code in the stack slot of the return address.

In your case, the location pointed to by the corrupted return address might not be executable, triggering the segmentation fault you observe.

I suggest you try compiling your code on this site to see the actual assembly code generated under various compiler options: http://gcc.godbolt.org/#

Answer 3

The next entry on the stack is the function's return address, which on a 64-bit system must be aligned to 8, so there is enough space for 16 characters.

You can verify this by declaring an int variable after the array. The int will be aligned to 4, leaving less space for characters, so the program will crash with a shorter string.
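A way to look at this without overflowing anything is to print how far apart the array and the int end up. This is a minimal sketch with made-up names; note that the compiler is free to place locals in any order and with any padding, so the int is not guaranteed to sit directly after the array and the result is only an observation for one particular build.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns (address of marker) - (address of buffer)
   in bytes. A negative value means the int was placed below the array. */
long long array_to_int_gap(void)
{
    char buffer[10];
    int marker;          /* declared after the array, as suggested above */

    buffer[0] = '\0';
    marker = 0;

    /* Compare the addresses as integers, since subtracting pointers to
       unrelated objects is not defined in C. */
    return (long long)((intptr_t)&marker - (intptr_t)buffer);
}

/* Hypothetical helper: prints the gap measured above. */
void print_array_to_int_gap(void)
{
    printf("&marker - buffer = %lld bytes\n", array_to_int_gap());
}
```

Calling print_array_to_int_gap() from main shows whether the extra variable changed the spacing in the way this answer predicts for a given compiler and flag set.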

3 answers

Answer 1
2 2015-10-08 11:07:31

Answer 2
2 2015-10-08 11:20:20

Answer 3
0 2015-10-08 11:00:04
