Can a C/C++ program seg-fault from reading past the end of an array (UNIX)?

I'm aware that you can read past the end of an array. What I'm wondering now is whether you can seg-fault just by performing that read.

int someints[100];
std::cerr << someints[100] << std::endl; //This is 1 past the end of the array.

Can the second line actually cause a seg-fault, or will it just print gibberish? Also, if I wrote to that memory, could that cause a seg-fault on that specific line, or would a fault only happen later, when something else tried to use the accidentally changed memory?

This is undefined behaviour, and what happens in practice depends on the virtual memory layout the operating system has arranged for the process. Generally you can either:

  • read memory that belongs to your virtual address space but holds a meaningless value, or
  • attempt to access an unmapped or protected address, in which case the memory-mapping hardware raises a page fault and the OS decides whether to kill your process or allocate more memory.
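The second case can be provoked deliberately. The following is a minimal sketch, assuming a Linux-like system with mmap, mprotect and fork; the helper name probe_read is made up for illustration. It maps two pages, makes the second one inaccessible, and lets a child process read one byte relative to the page boundary, so that any crash stays in the child. Other systems may report a different signal (for example SIGBUS) for the same access.

```cpp
#include <csignal>      // SIGSEGV
#include <sys/mman.h>   // mmap, mprotect, munmap
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WIFSIGNALED, WTERMSIG
#include <unistd.h>     // fork, sysconf, _exit

// Hypothetical helper: reads one byte at (end of first page + delta) in a
// child process. Returns the number of the signal that killed the child,
// 0 if the read succeeded, or -1 if the setup itself failed.
int probe_read(long delta) {
    const long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;

    // Two readable, zero-filled pages ...
    void* mem = mmap(nullptr, 2 * page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    char* base = static_cast<char*>(mem);

    // ... of which the second is turned into an inaccessible guard page.
    if (mprotect(base + page, page, PROT_NONE) != 0) {
        munmap(mem, 2 * page);
        return -1;
    }

    int result = -1;
    const pid_t child = fork();
    if (child == 0) {
        // volatile keeps the compiler from optimising the read away.
        volatile char* p = base + page + delta;
        char value = *p;
        (void)value;
        _exit(0);  // only reached if the read did not fault
    }
    if (child > 0) {
        int status = 0;
        if (waitpid(child, &status, 0) == child) {
            if (WIFSIGNALED(status)) result = WTERMSIG(status);
            else if (WIFEXITED(status)) result = 0;
        }
    }
    munmap(mem, 2 * page);
    return result;
}
```

Calling probe_read(-1) reads the last byte of the accessible page, while probe_read(0) touches the first byte of the guard page; on Linux the latter is expected to end the child with SIGSEGV.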

If someints is an array on the stack and is the last variable declared, you will most likely read some gibberish from the top of the stack or (very unlikely) trigger a page fault, which could either let the OS grow the stack or kill your process with a SIGSEGV.

Imagine you declare a single int right after your array:

int someints[100];
int on_top_of_stack = 42;
std::cerr << someints[100] << std::endl;

Then the program will most likely print 42, unless the compiler rearranges the order of the variables on the stack.
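Whether the two variables really end up next to each other can be checked without dereferencing anything out of bounds. A rough sketch, assuming a flat address space where pointers convert to plain integer addresses; gap_in_bytes and report_layout are names invented for this example:

```cpp
#include <cstdint>   // std::intptr_t
#include <iostream>

// Distance in bytes from `from` to `to`, computed on the integer values of
// the pointers (assumes a flat address space).
std::intptr_t gap_in_bytes(const int* from, const int* to) {
    return reinterpret_cast<std::intptr_t>(to) -
           reinterpret_cast<std::intptr_t>(from);
}

// Prints where on_top_of_stack sits relative to the end of someints.
void report_layout() {
    int someints[100];
    int on_top_of_stack = 42;
    // Forming the one-past-the-end pointer is allowed; only dereferencing
    // it is undefined.
    const int* end = someints + 100;
    std::cerr << "gap between end of array and the int: "
              << gap_in_bytes(end, &on_top_of_stack) << " bytes\n";
}
```

A gap of 0 would mean that someints[100] aliases on_top_of_stack in this particular build; any other value means the compiler placed the variables differently, and the result can change with the compiler and optimisation level.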

Yes, it can segfault if the memory at that address is not accessible to the program. In your case that is unlikely: the array is allocated on the stack and holds only 100 ints (a few hundred bytes), while the stack is significantly larger (e.g. 8 MB per thread on Linux 2.4.X), so you will just read uninitialized data. But in some cases it may crash. In either case, this code is erroneous, and tools like Valgrind should be able to help you troubleshoot it.
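To compare the two sizes on a given machine, something along these lines could be used. It is a sketch assuming a POSIX system that provides getrlimit; kArrayBytes and stack_soft_limit are illustrative names, not part of any API:

```cpp
#include <cstddef>         // std::size_t
#include <sys/resource.h>  // getrlimit, RLIMIT_STACK, rlim_t

// Size in bytes of the array from the question (100 ints, not 100 bytes).
constexpr std::size_t kArrayBytes = sizeof(int[100]);

// Stores the soft stack size limit of the calling process in `out`.
// Returns false if getrlimit fails. The value may be RLIM_INFINITY.
bool stack_soft_limit(rlim_t& out) {
    struct rlimit lim {};
    if (getrlimit(RLIMIT_STACK, &lim) != 0) return false;
    out = lim.rlim_cur;
    return true;
}
```

On a typical system the reported limit is several orders of magnitude larger than kArrayBytes, which is why a read just past the array usually lands in mapped stack memory.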

The second line can cause literally anything to happen, and the implementation would still be conforming as far as the language specification is concerned. It could print gibberish, it could crash due to a segmentation fault or something else, it could cause power to go out on the entire eastern seaboard, or it could cause the canonical demons to fly out of your nose...

That's the magic of undefined behaviour.
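For contrast, a bounds-checked access has fully defined behaviour. This is not something the answers above propose; it is only a small sketch of one alternative, using std::array::at, with checked_read_throws as a made-up helper name:

```cpp
#include <array>
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::out_of_range

// Returns true if reading index i of a 100-element array through at()
// throws std::out_of_range, false if the read succeeds.
bool checked_read_throws(std::size_t i) {
    std::array<int, 100> someints{};  // value-initialised to zeros
    try {
        int value = someints.at(i);   // at() checks i against size()
        (void)value;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Here index 99 reads normally and index 100 throws, instead of leaving the outcome to the memory layout.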
