
Pointer arithmetic in C and array bounds

I was browsing through a webpage that had some C FAQs, and I found this statement:

Similarly, if a has 10 elements and ip points to a[3], you can't compute or access ip + 10 or ip - 5. (There is one special case: you can, in this case, compute, but not access, a pointer to the nonexistent element just beyond the end of the array, which in this case is &a[10].

I was confused by the statement

you can't compute ip + 10

I can understand that accessing an out-of-bounds element is undefined, but computing?

I wrote the following snippet, which computes a pointer out of bounds (let me know if this is what the website meant by computing).

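#include <stdio.h>

int main(void)
{
        int a[10], i;
        int *p;

        for (i = 0; i < 10; i++)
                a[i] = i;

        p = &a[3];

        /* p + 10 would be &a[13]: deliberately out of bounds, the case being asked about */
        printf("p = %p and p+10 = %p\n", (void *)p, (void *)(p + 10));
        return 0;
}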

$ ./a.out
p = 0xbfa53bbc and p+10 = 0xbfa53be4

We can see that p + 10 points 10 elements (40 bytes) past p. So what exactly does the statement on the webpage mean? Did I misinterpret something?

Even in K&R (A.7.7) this statement is made:

The result of the + operator is the sum of the operands. A pointer to an object in an array and a value of any integral type may be added. ... The sum is a pointer of the same type as the original pointer, and points to another object in the same array, appropriately offset from the original object. Thus if P is a pointer to an object in an array, the expression P+1 is a pointer to the next object in the array. If the sum pointer points outside the bounds of the array, except at the first location beyond the high end, the result is undefined.

What does "undefined" mean here? Does it mean the sum itself is undefined, or only that the behavior is undefined when we dereference it? Is the operation undefined even when we do not dereference the pointer and merely compute a pointer to an out-of-bounds element?

Undefined behavior means exactly that: absolutely anything could happen. It could succeed silently, it could fail silently, it could crash your program, it could blue-screen your OS, or it could erase your hard drive. Some of these are not very likely, but all of them are permissible behaviors as far as the C language standard is concerned.

In this particular case, yes, the C standard is saying that even computing a pointer outside of the valid array bounds, without dereferencing it, is undefined behavior. The reason is that there are some arcane systems where doing such a calculation could result in a fault of some sort. For example, you might have an array at the very end of addressable memory, and constructing a pointer beyond that would cause an overflow in a special address register, which generates a trap or fault. The C standard wants to permit this behavior in order to be as portable as possible.

In reality, though, you'll find that constructing such an invalid address without dereferencing it behaves predictably on the vast majority of systems you'll come across in common usage. Creating an invalid memory address will usually have no ill effects unless you attempt to dereference it. But of course, it's better to avoid creating those invalid addresses so that your code will work perfectly even on those arcane systems.

The web page wording is confusing, but technically correct. The C99 language specification (section 6.5.6) discusses additive expressions, including pointer arithmetic. Paragraph 8 specifically states that computing a pointer up to one past the end of an array shall not cause an overflow, but beyond that the behavior is undefined.

In a more practical sense, C compilers will generally let you get away with it, but what you do with the resulting value is up to you. If you try to dereference the resulting pointer, as K&R states, the behavior is undefined.

Undefined, in programming terms, means "Don't do that." Basically, it means the specification that defines how the language works does not define an appropriate behavior in that situation. As a result, theoretically anything can happen. Generally all that happens is that you have a silent or noisy (segfault) bug in your program, but many programmers like to joke about other possible results of undefined behavior, like deleting all of your files.

The behaviour would be undefined in the following case:

int a[3];
(a + 10) ; // this is UB too as you are computing &a[10]
*(a+10) = 10; // Ewwww!!!!
