How dangerous is it to access an array out of bounds?

How dangerous is accessing an array outside of its bounds (in C)? It can sometimes happen that I read from outside the array (I now understand that I then access memory used by other parts of my program, or even memory beyond that), or that I try to set a value at an index outside of the array. The program sometimes crashes, but sometimes it just runs and gives unexpected results.

Now what I would like to know is, how dangerous is this really? If it damages my program, it is not so bad. If, on the other hand, it breaks something outside my program because I somehow managed to access some totally unrelated memory, then it is very bad, I imagine. I read a lot of 'anything can happen', 'segmentation might be the least bad problem', 'your hard disk might turn pink and unicorns might be singing under your window', which is all nice, but what is really the danger?

My questions:

  1. Can reading values from way outside the array damage anything apart from my program? I would imagine that just looking at things does not change anything, or would it, for instance, change the 'last time opened' attribute of a file I happened to reach?
  2. Can setting values way outside of the array damage anything apart from my program? From this Stack Overflow question I gather that it is possible to access any memory location, and that there is no safety guarantee.
  3. I now run my small programs from within XCode. Does that provide some extra protection around my program, so that it cannot reach outside its own memory? Can it harm XCode?
  4. Any recommendations on how to run my inherently buggy code safely?

I use OSX 10.7, Xcode 4.6.

As far as the ISO C standard (the official definition of the language) is concerned, accessing an array outside its bounds has "undefined behavior". The literal meaning of this is:

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

A non-normative note expands on this:

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

So that's the theory. What's the reality?

In the "best" case, you'll access some piece of memory that's either owned by your currently running program (which might cause your program to misbehave), or that's not owned by your currently running program (which will probably cause your program to crash with something like a segmentation fault). Or you might attempt to write to memory that your program owns, but that's marked read-only; this will probably also cause your program to crash.

That's assuming your program is running under an operating system that attempts to protect concurrently running processes from each other. If your code is running on the "bare metal", say if it's part of an OS kernel or an embedded system, then there is no such protection; your misbehaving code is what was supposed to provide that protection. In that case, the possibilities for damage are considerably greater, including, in some cases, physical damage to the hardware (or to things or people nearby).

Even in a protected OS environment, the protections aren't always 100%. There are operating system bugs that permit unprivileged programs to obtain root (administrative) access, for example. Even with ordinary user privileges, a malfunctioning program can consume excessive resources (CPU, memory, disk), possibly bringing down the entire system. A lot of malware (viruses, etc.) exploits buffer overruns to gain unauthorized access to the system.

(One historical example: I've heard that on some old systems with core memory, repeatedly accessing a single memory location in a tight loop could literally cause that chunk of memory to melt. Other possibilities include destroying a CRT display, and moving the read/write head of a disk drive at the harmonic frequency of the drive cabinet, causing it to walk across a table and fall onto the floor.)

And there's always Skynet to worry about.

The bottom line is this: if you could write a program to do something bad deliberately, it's at least theoretically possible that a buggy program could do the same thing accidentally.

In practice, it's very unlikely that your buggy program running on a MacOS X system is going to do anything more serious than crash. But it's not possible to completely prevent buggy code from doing really bad things.

In general, operating systems of today (the popular ones anyway) run all applications in protected memory regions using a virtual memory manager. It turns out that it is not terribly EASY (per se) to simply read or write to a location that exists in REAL space outside the region(s) that have been assigned / allocated to your process.

Direct answers:

  1. Reading will almost never directly damage another process; however, it can indirectly damage a process if you happen to read a KEY value used to encrypt, decrypt, or validate a program / process. Reading out of bounds can have somewhat adverse / unexpected effects on your code if you are making decisions based on the data you are reading.

  2. The only way you could really DAMAGE something by writing to a location accessible by a memory address is if the memory address you are writing to is actually a hardware register (a location that is not for data storage but for controlling some piece of hardware), not a RAM location. In fact, you still won't normally damage something unless you are writing to some one-time programmable location that is not re-writable (or something of that nature).

  3. Running from within the debugger generally runs the code in debug mode. Debug mode TENDS to (but does not always) stop your code sooner when you have done something considered bad practice or downright illegal.

  4. Never use macros; use data structures that already have array index bounds checking built in; and so on.
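
As a minimal sketch of the "bounds checking built in" idea from item 4: C has no such container in its standard library, so the `IntArray` type and the `int_array_set` / `int_array_get` helpers below are hypothetical names invented for illustration. They refuse an out-of-range index instead of touching memory outside the array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed-capacity array that carries its own length. */
typedef struct {
    int    data[8];
    size_t len;      /* number of valid elements, must be <= 8 */
} IntArray;

/* Store value at index; returns false instead of writing out of bounds. */
bool int_array_set(IntArray *a, size_t index, int value)
{
    if (a == NULL || index >= a->len)
        return false;
    a->data[index] = value;
    return true;
}

/* Read the element at index into *out; returns false if index is out of range. */
bool int_array_get(const IntArray *a, size_t index, int *out)
{
    if (a == NULL || out == NULL || index >= a->len)
        return false;
    *out = a->data[index];
    return true;
}
```

A caller checks the return value rather than assuming the access succeeded, so an off-by-one index becomes a reported error instead of undefined behavior.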

ADDITIONAL: I should add that the above information really only applies to systems using an operating system with memory protection windows. If you are writing code for an embedded system, or even a system with an operating system (real-time or other) that does not have memory protection windows (or virtual addressed windows), you should practice a lot more caution in reading and writing to memory. In these cases too, SAFE and SECURE coding practices should always be employed to avoid security issues.

Not checking bounds can lead to ugly side effects, including security holes. One of the ugly ones is arbitrary code execution. The classic example: if you have a fixed-size array and use strcpy() to put a user-supplied string there, the user can give you a string that overflows the buffer and overwrites other memory locations, including the code address the CPU should return to when your function finishes.

This means your user can send you a string that will cause your program to essentially call exec("/bin/sh"), which turns it into a shell that executes anything the user wants on your system, including harvesting all your data and turning your machine into a botnet node.

See Smashing The Stack For Fun And Profit for details on how this can be done.
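
One common defence against the strcpy() problem described above is to check the length before copying. The following is only a sketch: `copy_string_checked` is a hypothetical helper name, not a standard library function, and it rejects input that does not fit rather than truncating it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, whose capacity is dst_size bytes including the
 * terminating '\0'. Returns false and leaves dst as an empty string
 * if src does not fit, so nothing is ever written past dst. */
bool copy_string_checked(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return false;

    size_t len = strlen(src);
    if (len >= dst_size) {          /* no room for len bytes plus '\0' */
        dst[0] = '\0';
        return false;
    }
    memcpy(dst, src, len + 1);      /* copies the terminator too */
    return true;
}
```

With a buffer declared as `char buf[8]`, a call such as `copy_string_checked(buf, sizeof buf, input)` accepts at most seven characters and reports failure for anything longer.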

You write:

I read a lot of 'anything can happen', 'segmentation might be the least bad problem', 'your harddisk might turn pink and unicorns might be singing under your window', which is all nice, but what is really the danger?

Let's put it this way: load a gun. Point it out of the window without any particular aim and fire. What is the danger?

The issue is that you do not know. If your code overwrites something that crashes your program, you are fine, because the crash stops it in a defined state. If it does not crash, however, the issues start to arise. Which resources are under the control of your program, and what might it do to them? I know of at least one major issue that was caused by such an overflow. The issue was in a seemingly meaningless statistics function that messed up some unrelated conversion table for a production database. The result was some very expensive cleanup afterwards. Actually, it would have been much cheaper and easier to handle if this issue had formatted the hard disks. In other words: pink unicorns might be your least problem.

The idea that your operating system will protect you is optimistic. If possible, try to avoid writing out of bounds.

If you do not run your program as root or any other privileged user, it is much less likely to harm your system, so generally this is a good idea.

By writing data to some random memory location you won't directly "damage" any other program running on your computer, as each process runs in its own memory space.

If you try to access any memory not allocated to your process, the operating system will stop your program from executing with a segmentation fault.

So directly (without running as root and directly accessing files like /dev/mem) there is no danger that your program will interfere with any other program running on your operating system.

Nevertheless - and this is probably what you have heard about in terms of danger - by accidentally writing random data to random memory locations you can certainly damage anything your program is able to damage.

For example, your program might want to delete a specific file whose name is stored somewhere in your program. If you accidentally overwrite the location where the file name is stored, you might delete a very different file instead.
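
The file-name scenario can be illustrated without actually invoking undefined behavior. In the sketch below, `struct Job` and `simulate_label_overrun` are hypothetical names, and the layout (the `filename` member starting 8 bytes into the struct) is what mainstream compilers produce rather than something the C standard guarantees. The copy stays inside the struct as a whole, but it runs past the 8-byte `label` member in the same way an unchecked write to `label` would.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a short label followed by the file to delete later. */
struct Job {
    char label[8];
    char filename[16];
};

/* Simulate an overrun of job->label: copy n bytes to the start of the
 * struct. n is clamped to sizeof *job so the copy itself stays inside
 * the object; everything past the first 8 bytes lands in whatever
 * member follows label. The caller must supply at least n readable bytes. */
void simulate_label_overrun(struct Job *job, const char *bytes, size_t n)
{
    if (n > sizeof *job)
        n = sizeof *job;
    memcpy(job, bytes, n);
}
```

If `job.filename` held "old.tmp" and 18 bytes of "AAAAAAAAnotes.txt" (including the terminator) are copied in, then on a layout where `filename` starts at offset 8 the member now reads "notes.txt", and a later delete would target a different file than intended.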

NSArrays in Objective-C are assigned a specific block of memory. Exceeding the bounds of the array means that you would be accessing memory that is not assigned to the array. This means:

  1. This memory can have any value. There's no way of knowing if the data is valid based on your data type.
  2. This memory may contain sensitive information such as private keys or other user credentials.
  3. The memory address may be invalid or protected.
  4. The memory can have a changing value because it's being accessed by another program or thread.
  5. Other things use memory address space, such as memory-mapped ports.
  6. Writing data to an unknown memory address can crash your program, overwrite OS memory space, and generally cause the sun to implode.

From the point of view of your program, you always want to know when your code is exceeding the bounds of an array. Exceeding them can lead to unknown values being returned, causing your application to crash or provide invalid data.

You may want to try using the memcheck tool in Valgrind when you test your code. It won't catch individual array bounds violations within a stack frame, but it should catch many other sorts of memory problems, including ones that would cause subtle, wider problems outside the scope of a single function.

From the manual:

Memcheck is a memory error detector. It can detect the following problems that are common in C and C++ programs.

  • Accessing memory you shouldn't, e.g. overrunning and underrunning heap blocks, overrunning the top of the stack, and accessing memory after it has been freed.
  • Using undefined values, i.e. values that have not been initialised, or that have been derived from other undefined values.
  • Incorrect freeing of heap memory, such as double-freeing heap blocks, or mismatched use of malloc/new/new[] versus free/delete/delete[].
  • Overlapping src and dst pointers in memcpy and related functions.
  • Memory leaks.

ETA: Though, as Kaz's answer says, it's not a panacea, and doesn't always give the most helpful output, especially when you're using exciting access patterns.
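
To make the heap case concrete, here is a small sketch of the kind of code where Memcheck helps. `make_squares` is a hypothetical function written for this illustration. As written it stays in bounds; the comment marks the one-character change that would turn it into a heap overrun, which Memcheck would typically report as an invalid write past the end of the allocated block.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate count ints on the heap and fill element i with i * i.
 * Returns NULL if the allocation fails; the caller must free the result. */
int *make_squares(size_t count)
{
    int *values = malloc(count * sizeof *values);
    if (values == NULL)
        return NULL;

    /* Writing `i <= count` here instead would store one int past the
     * end of the heap block: the classic off-by-one overrun. */
    for (size_t i = 0; i < count; i++)
        values[i] = (int)(i * i);

    return values;
}
```

Running the test binary under `valgrind ./your_program` (Memcheck is Valgrind's default tool) exercises this kind of check, with the caveats noted above.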

If you ever do systems level programming or embedded systems programming, very bad things can happen if you write to random memory locations. Older systems and many micro-controllers use memory mapped IO, so writing to a memory location that maps to a peripheral register can wreak havoc, especially if it is done asynchronously.

An example is programming flash memory. Programming mode on the memory chips is enabled by writing a specific sequence of values to specific locations inside the address range of the chip. If another process were to write to any other location in the chip while that was going on, it would cause the programming cycle to fail.

In some cases the hardware will wrap addresses around (most significant bits/bytes of address are ignored) so writing to an address beyond the end of the physical address space will actually result in data being written right in the middle of things.
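
The wrap-around effect can be modelled in a few lines. This is a sketch of a hypothetical bus that decodes only the low 16 address bits; `DECODED_ADDRESS_BITS` and `effective_address` are names made up for the illustration and do not describe any particular chip.

```c
#include <stdint.h>

/* Assumption for this sketch: only the low 16 address lines are decoded,
 * so the device sees a 64 KiB address space. */
#define DECODED_ADDRESS_BITS 16u

/* Return the address the hardware would actually use: the upper bits of
 * the requested address are simply ignored. */
uint32_t effective_address(uint32_t requested)
{
    uint32_t mask = (UINT32_C(1) << DECODED_ADDRESS_BITS) - 1u;
    return requested & mask;
}
```

Under that assumption, a write aimed just past the end of memory at 0x10000 lands at address 0x0000, and one aimed at 0x18000 lands at 0x8000, right in the middle of the real address range.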

And finally, older CPUs like the MC68000 can lock up to the point that only a hardware reset can get them going again. I haven't worked on them for a couple of decades, but I believe that when the CPU encountered a bus error (non-existent memory) while trying to handle an exception, it would simply halt until the hardware reset was asserted.

My biggest recommendation is a blatant plug for a product, although I have no personal interest in it and I am not affiliated with its makers in any way. Based on a couple of decades of C programming and embedded systems where reliability was critical: Gimpel's PC Lint will not only detect those sorts of errors, it will make a better C/C++ programmer out of you by constantly harping on you about bad habits.

I'd also recommend reading the MISRA C coding standard, if you can snag a copy from someone. I haven't seen any recent ones but in ye olde days they gave a good explanation of why you should/shouldn't do the things they cover.

Dunno about you, but about the 2nd or 3rd time I get a coredump or hangup from any application, my opinion of whatever company produced it goes down by half. The 4th or 5th time, whatever the package is becomes shelfware, and I drive a wooden stake through the center of the package/disc it came in just to make sure it never comes back to haunt me.

I'm working with a compiler for a DSP chip which deliberately generates code that accesses one element past the end of an array, from C code which does not!

This is because the loops are structured so that the end of an iteration prefetches some data for the next iteration. So the datum prefetched at the end of the last iteration is never actually used.

Writing C code like that invokes undefined behavior, but that is only a formality from a standards document which concerns itself with maximal portability.
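
For comparison, the same prefetching loop shape can be written in portable C by guarding the last prefetch. This is only a sketch of the idea, not what the DSP compiler emits; `sum_pipelined` is a hypothetical function name, and the guard costs a comparison that the compiler-generated version avoids.

```c
#include <stddef.h>

/* Sum n ints using a software-pipelined loop: each iteration consumes the
 * value fetched by the previous one and fetches the next value early. */
long sum_pipelined(const int *data, size_t n)
{
    if (n == 0)
        return 0;

    long total = 0;
    int current = data[0];              /* prologue: first fetch */

    for (size_t i = 0; i < n; i++) {
        /* Guarded prefetch: on the last iteration there is no next
         * element, so substitute a dummy value instead of reading
         * data[n], which would be out of bounds. */
        int next = (i + 1 < n) ? data[i + 1] : 0;
        total += current;
        current = next;
    }
    return total;
}
```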

More often than not, a program which accesses out of bounds is not cleverly optimized. It is simply buggy. The code fetches some garbage value and, unlike the optimized loops of the aforementioned compiler, the code then uses the value in subsequent computations, thereby corrupting them.

It is worth catching bugs like that, and so it is worth making the behavior undefined for even just that reason alone: so that the run-time can produce a diagnostic message like "array overrun in line 42 of main.c".

On systems with virtual memory, an array could happen to be allocated such that the address which follows it is in an unmapped area of virtual memory. The access will then bomb the program.

As an aside, note that in C we are permitted to create a pointer which is one past the end of an array. And this pointer has to compare greater than any pointer to the interior of the array. This means that a C implementation cannot place an array right at the end of memory, where the one-past-the-end address would wrap around and look smaller than other addresses in the array.
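
The one-past-the-end rule is what makes the usual half-open iteration idiom legal. In this sketch (`sum_range` is a hypothetical helper), the `last` pointer may point one element past the array; it is only compared against, never dereferenced.

```c
#include <stddef.h>

/* Sum the ints in the half-open range [first, last). last may be the
 * one-past-the-end pointer of the array: forming and comparing it is
 * allowed, dereferencing it is not. */
long sum_range(const int *first, const int *last)
{
    long total = 0;
    for (const int *p = first; p != last; ++p)
        total += *p;
    return total;
}
```

For `int a[5]`, the call `sum_range(a, a + 5)` is well defined, whereas reading `a[5]` or `*(a + 5)` is not.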

Nevertheless, access to uninitialized or out of bounds values is sometimes a valid optimization technique, even if not maximally portable. This is, for instance, why the Valgrind tool does not report accesses to uninitialized data when those accesses happen, but only when the value is later used in some way that could affect the outcome of the program. You get a diagnostic like "conditional branch in xxx:nnn depends on uninitialized value" and it can sometimes be hard to track down where it originates. If all such accesses were trapped immediately, there would be a lot of false positives arising from compiler optimized code as well as correctly hand-optimized code.

Speaking of which, I was working with some codec from a vendor which was giving off these errors when ported to Linux and run under Valgrind. But the vendor convinced me that only several bits of the value being used actually came from uninitialized memory, and those bits were carefully avoided by the logic. Only the good bits of the value were being used, and Valgrind doesn't have the ability to track down to the individual bit. The uninitialized material came from reading a word past the end of a bit stream of encoded data, but the code knows how many bits are in the stream and will not use more bits than there actually are. Since the access beyond the end of the bit stream array does not cause any harm on the DSP architecture (there is no virtual memory after the array, no memory-mapped ports, and the address does not wrap) it is a valid optimization technique.

"Undefined behavior" does not really mean much, because according to ISO C, simply including a header which is not defined in the C standard, or calling a function which is not defined in the program itself or the C standard, are examples of undefined behavior. Undefined behavior doesn't mean "not defined by anyone on the planet", just "not defined by the ISO C standard". But of course, sometimes undefined behavior really is absolutely not defined by anyone.

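I don't think you will break anything apart from your own program. In the worst case, you will try to read from or write to a memory address corresponding to a page that the kernel has not assigned to your process, which raises the appropriate exception and gets you killed (I mean, your process).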

Arrays with two or more dimensions pose a consideration beyond those mentioned in other answers. Consider the following functions:

char arr1[2][8];
char arr2[4];
int test1(int n)
{
  arr1[1][0] = 1;
  for (int i=0; i<n; i++) arr1[0][i] = arr2[i];      
  return arr1[1][0];
}
int test2(int ofs, int n)
{
  arr1[1][0] = 1;
  for (int i=0; i<n; i++) *(arr1[0]+i) = arr2[i];      
  return arr1[1][0];
}

The way gcc processes the first function does not allow for the possibility that an attempt to write arr1[0][i] might affect the value of arr1[1][0], and the generated code is incapable of returning anything other than a hardcoded value of 1. Although the Standard defines the meaning of array[index] as precisely equivalent to (*((array)+(index))), gcc seems to interpret the notion of array bounds and pointer decay differently in cases which involve using the [] operator on values of array type, versus those which use explicit pointer arithmetic.
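
Either way, the robust fix is to keep `n` within both arrays so the question of what an overrun "should" do never arises. The sketch below repeats the two array declarations so that it stands alone; `test1_checked` is a hypothetical variant of test1, and clamping (rather than reporting an error) is an arbitrary choice made for the illustration.

```c
#include <stddef.h>

char arr1[2][8];
char arr2[4];

/* Like test1, but n is clamped so that neither arr2 nor the inner
 * array arr1[0] is ever indexed out of bounds. */
int test1_checked(int n)
{
    /* The smaller of the two element counts: 4 for the sizes above. */
    const int limit = (int)(sizeof arr2 < sizeof arr1[0] ? sizeof arr2
                                                         : sizeof arr1[0]);
    if (n < 0)
        n = 0;
    if (n > limit)
        n = limit;

    arr1[1][0] = 1;
    for (int i = 0; i < n; i++)
        arr1[0][i] = arr2[i];
    return arr1[1][0];
}
```

With every index in range, returning a hardcoded 1 is simply the correct result, whichever way the compiler reasons about array bounds.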

I just want to add some practical examples to this question. Imagine the following code:

#include <stdio.h>

int main(void) {
    int n[5];
    n[5] = 1;

    printf("answer %d\n", n[5]);

    return (0);
}

Which has Undefined Behaviour. If you enable, for example, clang optimisations (-Ofast), it would result in something like:

answer 748418584

(If you compile without optimisations, it will probably output the expected result, answer 1.)

This is because in the first case the assignment of 1 is never actually emitted in the final code (you can look at the godbolt asm output as well).

(However, it must be noted that by that logic main should not even call printf, so the best advice is not to depend on the optimiser to resolve your UB, but rather to be aware that it may sometimes work this way.)

The takeaway here is that modern optimising C compilers assume that undefined behaviour (UB) never occurs, which means the above code would be treated as similar to (but not the same as) something like:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n[5];

    if (0)
        n[5] = 1;

    printf("answer %d\n", (exit(-1), n[5]));

    return (0);
} 

which, on the contrary, is perfectly defined.

That's because the first conditional statement never reaches its true state (0 is always false).

And in the second argument to printf there is a sequence point inside the comma expression: exit is called first, and the program terminates before the UB in the second operand of the comma operator is evaluated (so it's well defined).

So the second takeaway is that UB is not UB as long as it's never actually evaluated.

Additionally, I don't see it mentioned here that there is a fairly modern Undefined Behaviour sanitiser (at least in clang) which, with the option -fsanitize=undefined, will give the following output on the first example (but not the second):

/app/example.c:5:5: runtime error: index 5 out of bounds for type 'int[5]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /app/example.c:5:5 in 
/app/example.c:7:27: runtime error: index 5 out of bounds for type 'int[5]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /app/example.c:7:27 in 
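
Both reports point at the index 5 used on an int[5]. One way to avoid that class of mistake, sketched here with a hypothetical function `last_element_demo`, is to derive the index from the array's own size instead of hard-coding it.

```c
#include <stddef.h>

/* Store 1 in the last valid element of a 5-element array and return it.
 * The index is computed from the array itself, so it stays correct if
 * the array's size is changed later. */
int last_element_demo(void)
{
    int n[5] = {0};
    size_t last = sizeof n / sizeof n[0] - 1;   /* 4 for int n[5] */

    n[last] = 1;
    return n[last];
}
```

Every access here is in bounds, so there is nothing for the sanitiser to report on these lines.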

Here are all the samples in godbolt:

https://godbolt.org/z/eY9ja4fdh (first example and no flags)

https://godbolt.org/z/cGcY7Ta9M (first example and -Ofast clang)

https://godbolt.org/z/cGcY7Ta9M (second example and UB sanitiser on)

https://godbolt.org/z/vE531EKo4 (first example and UB sanitiser on)

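Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you reproduce them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.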

 