Why is the printf statement in the code below printing a value rather than a garbage value?

#include <stdio.h>

int main(){
    int array[] = {10,20,30,40,50} ;
    printf("%d\n",-2[array -2]);
    return 0 ;
}

Can anyone explain how -2[array-2] works, and why [ ] is used here? This was a question in my assignment. It gives the output "-10", but I don't understand why.

Technically speaking, this invokes undefined behaviour. Quoting C11, chapter §6.5.6:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. [....]

So, (array-2) is undefined behavior.

However, most compilers will read the indexing and will likely cancel the +2 and -2 offsets [2[a] is the same as a[2], which is the same as *(a+2); thus 2[a-2] is *((2)+(a-2))], leaving only the remaining expression to be evaluated, which is *(a), or a[0].

Then, check the operator precedence:

-2[array -2] is effectively the same as -(array[0]). So, the result is the value of array[0], negated.

This is an unfortunate example for instruction, because it implies it's okay to do some incorrect things that often work in practice.

The technically correct answer is that the program has Undefined Behavior, so any result is possible, including printing -10, printing a different number, printing something different or nothing at all, failing to run, crashing, and/or doing something entirely unrelated.

The undefined behavior comes from evaluating the subexpression array -2. array decays from its array type to a pointer to the first element. array -2 would point at the element which comes two positions before that, but there is no such element (and the "one-past-the-end" special rule does not apply here), so evaluating it is a problem no matter what context it appears in.

(C11 6.5.6/8 says)

When an expression that has integer type is added to or subtracted from a pointer, .... If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.


Now the technically incorrect answer the instructor is probably looking for is what actually happens on most implementations:

Even though array -2 is outside the actual array, it evaluates to some address which is 2*sizeof(int) bytes before the address where the array's data starts. It would be invalid to dereference that address, since we don't know that there actually is any int there, but the expression never does so.

Looking at the larger expression -2[array -2], the [] operator has higher precedence than the unary - operator, so it means -(2[array -2]) and not (-2)[array -2]. A[B] is defined to mean the same as *((A)+(B)). It's customary to have A be a pointer value and B be an integer value, but it's also legal to use them reversed like we're doing here. So these are equivalent:

-2[array -2]
-(2[array -2])
-(*(2 + (array - 2)))
-(*(array))

The last step acts like we would expect: adding two to the address value of array - 2 gives the address 2*sizeof(int) bytes after that value, which gets us back to the address of the first array element. So *(array) dereferences that address, giving 10, and -(*(array)) negates that value, giving -10. The program prints -10.


You should never count on things like this, even if you observe it "works" on your system and compiler. Since the language guarantees nothing about what will happen, the code might not work if you make slight changes which seem like they shouldn't be related, or on a different system, a different compiler, a different version of the same compiler, or using the same system and compiler on a different day.

Here is how -2[array-2] is evaluated:

First, note that -2[array-2] is parsed as - (2[array-2]). The subscript operator, [...], has higher precedence than the unary - operator. We often think of constants like -2 as single numbers, but -2 is in fact a - operator applied to a 2.

In array-2, array is automatically converted to a pointer to its first element, so it points to array[0].

Then array-2 attempts to calculate a pointer to two elements before the first element of the array. The resulting behavior is not defined by the C standard because C 2018 6.5.6 8 defines only pointer arithmetic whose result points to an element of the array or to one past its last element.

For illustration only, suppose we are using a C implementation that extends the C standard by defining pointers to use a flat address space and permit arbitrary pointer arithmetic. Then array-2 points two elements before the array.

Then 2[array-2] uses the fact that the C standard defines E1[E2] to be *((E1)+(E2)). That is, the subscript operator is implemented by adding the two things and applying *. Thus, it does not matter which expression is E1 and which is E2. E1+E2 is the same as E2+E1. So 2[array-2] is *(2 + (array-2)). Adding 2 moves the pointer from two elements before the array back to the start of the array. Then applying * produces the element at that location, which is 10.

Finally, applying - gives -10. (Recall that this conclusion is only achieved using our supposition that the C implementation supports a flat address space. You cannot use this in general C code.)

This code invokes undefined behavior and can print anything, including -10.

C17 6.5.2.1 Array subscripting states:

The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2)))

Meaning array[n] is equivalent to *((array) + (n)), and that's how the compiler evaluates subscripting. This allows us to write silly obfuscation like n[array], which is 100% equivalent to array[n], because *((n) + (array)) is equivalent to *((array) + (n)). As explained here:
With arrays, why is it the case that a[5] == 5[a]?

Looking at the expression -2[array -2] specifically:

  • [array -2] and [array - 2] are naturally equivalent. In this case the former is just sloppy style purposely used for the sake of obfuscating the code.
  • Operator precedence tells us to first consider [].
  • Thus the expression is equivalent to -*( (2) + (array - 2) )
  • Note that the first - is not part of the integer constant 2. C does not support negative integer constants 1); the - is actually the unary minus operator.
  • Unary minus has lower precedence than [], so the 2 in -2[ "binds" to the [.
  • The sub-expression (array - 2) is evaluated individually and invokes undefined behavior, as per C17 6.5.6/8:

    When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. /--/ If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

  • Speculatively, one potential form of undefined behavior could be that a compiler decides to replace the whole expression (2) + (array - 2) with array, in which case the whole expression would end up as -*array and print -10.

    There are no guarantees of this, and therefore the code is bad. If you were given the assignment to explain why the code prints -10, your teacher is incompetent. Not only is it meaningless/harmful to study obfuscation as part of C studies, it is harmful to rely on undefined behavior or expect it to give a certain result.


1) C instead supports negative integer constant expressions. -2 is an integer constant expression, where 2 is an integer constant of type int.
