
Shouldn't this give an out-of-bounds warning?

I think this code should warn about an out-of-bounds array access:

int foo() {
  int x[10] = {0};
  int *p = &x[5];
  return p[~0LLU];
}

I know out-of-bounds warnings are not required by the standard, but compilers do give them. I'm asking whether it would be correct for the compiler to give such a warning here.

Is there any reason why this code should be considered well-formed?

Regarding "I think this code should warn about an out-of-bounds array access":

A decent compiler could warn you when you do that on non-VLA arrays (gcc does not, but clang does: https://godbolt.org/z/lOvl5n).

For this snippet:

int foo() {
  int x[10] = {0};  
  return x[~0LLU];  // or x[40] to make it simpler, same thing
}

clang emits this warning:

<source>:3:10: warning: array index -1 is past the end of the array (which contains 10 elements) [-Warray-bounds]
  return x[~0LLU];
         ^ ~~~~~

The compiler knows that this is an array and knows its size, so it can check the bounds when everything is known at compile time (a non-VLA array and a literal index are the prerequisites).

In your case, what throws the compiler off is that you're assigning to a pointer (the array decays into a pointer).

After that, the compiler isn't able to tell where the data came from, so it cannot check the bounds (even if, as in your case, the offset is ludicrously big, negative, or whatever). A dedicated static analysis tool might find the issue.
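Since the bound is no longer visible through the pointer, one option is to carry it along explicitly and check at run time. The following is only a minimal sketch: `checked_get` is a hypothetical helper (not part of any library) that takes the array length and the starting index alongside the base pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reads base[start + offset] only if that index lies
 * inside an array of len elements. Returns false for an out-of-range index
 * and leaves *out untouched in that case. */
bool checked_get(const int *base, size_t len, size_t start,
                 long long offset, int *out)
{
    size_t index;

    if (start >= len)
        return false;

    if (offset < 0) {
        /* -(offset + 1) cannot overflow, even for the smallest long long. */
        unsigned long long back = (unsigned long long)(-(offset + 1));
        if (back >= start)
            return false;               /* would step before element 0 */
        index = start - (size_t)back - 1;
    } else {
        if ((unsigned long long)offset >= len - start)
            return false;               /* would step past the last element */
        index = start + (size_t)offset;
    }

    *out = base[index];
    return true;
}
```

With `int x[10]` and a start index of 5, an offset of -1 reads `x[4]`, while an offset of 10 or -6 is rejected instead of invoking undefined behaviour.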

The C language imposes no requirements on bounds checking of arrays. That is part of what makes it fast. That being said, compilers can and do perform checks in some situations.

For example, if I compile with -O3 in gcc and replace return p[~0LLU]; with return p[10]; I get the following warning:

x1.c: In function ‘foo’:
x1.c:6:10: warning: ‘*((void *)&x+60)’ is used uninitialized in this function [-Wuninitialized]
   return p[10];

I get a similar warning if I use -10 as the index:

gcc -g -O3 -Wall -Wextra -Warray-bounds -o x1 x1.c
x1.c: In function ‘foo’:
x1.c:6:10: warning: ‘*((void *)&x+-20)’ is used uninitialized in this function [-Wuninitialized]
   return p[-10];

So it does seem that it can warn about invalid negative values for an array index.

In your case, it seems that for this compiler the value ~0LLU is converted to a signed value for the purposes of pointer arithmetic and is viewed as -1.
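The -1 can be reproduced with a small sketch. The function names below are made up for illustration. Note that converting an out-of-range unsigned value to a signed type is implementation-defined in C; GCC and Clang reduce the value modulo 2^N, which gives -1 here.

```c
#include <limits.h>

/* ~0LLU has every bit set, i.e. it is the largest unsigned long long. */
unsigned long long all_ones(void)
{
    return ~0LLU;
}

/* View the index as a signed value, as the diagnostic appears to do.
 * Implementation-defined in C; -1 on two's-complement targets where the
 * conversion wraps (assumed here). */
long long as_signed_index(unsigned long long idx)
{
    return (long long)idx;
}
```

Under that assumption, `all_ones()` equals `ULLONG_MAX` and `as_signed_index(all_ones())` is -1, which is the index the compiler reports.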

Note that this check can be fooled by putting other initialized variables around x:

#include <stdio.h>

int foo() {
  int y[10] = {0};
  int x[10] = {0};
  int z[10] = {0};
  int *p = &x[5];
  printf("&x=%p, &y=%p, &z=%p\n", (void *)x, (void *)y, (void *)z);
  return p[10] + y[0] + z[0];
}

This code produces no warnings even though p[10] is out of bounds.

So it's up to the implementation whether it performs an out-of-bounds check and how it does so.

Edit: Complete rewrite, with standard quotes:

[dcl.array] [ Note: Except where it has been declared for a class, the subscript operator [] is interpreted in such a way that E1[E2] is identical to *((E1)+(E2))

[expr.add] When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined.

Therefore p[~0LLU] is interpreted identically to *(p + ~0LLU) (as per [dcl.array]), where the parenthesised expression points to the element x[5 + ~0LLU] if that index is within the valid range (as per [expr.add]). If the index isn't within range, the behaviour is undefined.

Is 5 + ~0LLU within the valid range of indices, though? Given the integer conversion rules of the language, the expression would appear to be well-defined if the type of 5 were a signed type no larger than unsigned long long, and in that case the pointed-to element would be x[4]. However, the standard doesn't explicitly define the types of i and j in the expression that describes the behaviour. It should be interpreted as a purely mathematical expression, in which case the result would be an index that is unrepresentable in unsigned long long, certainly greater than n, and thus the behaviour is undefined.
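The difference between the two readings can be made concrete. The sketch below uses hypothetical helpers (not from the standard or any library): `wrapped_index` evaluates i + j in unsigned long long arithmetic, which wraps modulo 2^N, while `in_range_mathematically` tests 0 ≤ i + j ≤ n on the true mathematical values for a non-negative j, without ever computing a sum that could wrap.

```c
#include <stdbool.h>

/* i + j computed in unsigned long long arithmetic; wraps modulo 2^N. */
unsigned long long wrapped_index(unsigned long long i, unsigned long long j)
{
    return i + j;
}

/* Tests 0 <= i + j <= n on the mathematical values, for non-negative j.
 * Rearranged as j <= n - i so that no intermediate result can wrap. */
bool in_range_mathematically(unsigned long long i, unsigned long long n,
                             unsigned long long j)
{
    return i <= n && j <= n - i;
}
```

With i = 5, n = 10 and j = ~0LLU, the wrapped sum is 4, but the mathematical test fails, which matches the undefined-behaviour reading.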

Given the interpretation that the behaviour is undefined, it wouldn't be incorrect for the compiler to warn. Regardless, the compiler is not required to warn.
