
Pointer-to-array overlapping end of array

Is this code correct?

int arr[2];

int (*ptr)[2] = (int (*)[2]) &arr[1];

ptr[0][0] = 0;

Obviously ptr[0][1] would be invalid, since it accesses out of bounds of arr.

Note: There's no doubt that ptr[0][0] designates the same memory location as arr[1]; the question is whether we are allowed to access that memory location via ptr. There are other examples where an expression does designate the same memory location but it is not permitted to access the location that way.

Note 2: Also consider **ptr = 0;. As pointed out by Marc van Leeuwen, ptr[0] is equivalent to *(ptr + 0), however ptr + 0 seems to fall foul of the pointer arithmetic section. Using *ptr instead avoids that.

Not an answer, but a comment that I can't seem to word well without it becoming a wall of text:

Arrays are guaranteed to store their contents contiguously so that they can be 'iterated over' using a pointer. If I can take a pointer to the beginning of an array and successively increment that pointer until I have accessed every element of the array, then surely that amounts to a statement that the array can be accessed as a series of whatever type it is composed of.

Surely, given the combination of:
1) array[x] stores its first element at address 'array',
2) successive increments of a pointer to it are sufficient to access the next item,
3) array[x-1] obeys the same rules,
it should be legal to at least look at the address 'array' as if it were of type array[x-1] instead of type array[x].

Furthermore, given the points about contiguity and how pointers to elements of the array have to behave, surely it must be legal to group any contiguous subset of array[x] as array[y], where y < x and its upper bound does not exceed the extent of array[x].

Not being a language lawyer, this is just me spouting some rubbish. I am very interested in the outcome of this discussion, though.

EDIT:

On further consideration of the original code, it seems to me that arrays are themselves very much a special case in many regards. They decay to a pointer, and I believe they can be aliased as described earlier in this post.

So, without any standardese to back up my humble opinion: an array can't really be invalid or 'undefined' as a whole if it never really gets treated uniformly as a whole.

What does get treated uniformly are the individual elements. So I think it only makes sense to talk about whether accessing a specific element is valid or defined.

For C++ (I'm using draft N4296), [dcl.array]/7 says in particular that if the result of subscripting is an array, it is immediately converted to a pointer. That is, in ptr[0][0], ptr[0] is first converted to int* and only then is the second [0] applied to it. So it's perfectly valid code.

For C (C11 draft N1570), 6.5.2.1/3 states the same.

Yes, this is correct code. Quoting N4140 for C++14:

[expr.sub]/1 ... The expression E1[E2] is identical (by definition) to *((E1)+(E2))

[expr.add]/5 ... If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

There is no overflow here. &*(*(ptr)) == &ptr[0][0] == &arr[1].

For C11 (N1570) the rules are the same: §6.5.2.1 and §6.5.6.

Let me give a dissenting opinion: this is (at least in C++) undefined behaviour, for much the same reason as in the other question that this question links to.

First let me clarify the example with some typedefs that will simplify the discussion.

typedef int two_ints[2];
typedef int* int_ptr;
typedef two_ints* two_ints_ptr;

two_ints arr;

two_ints_ptr ptr = (two_ints_ptr) &arr[1];

int_ptr temp = ptr[0]; // the two_ints value ptr[0] gets converted to int_ptr
temp[0] = 0;

So the question is this: there is no object of type two_ints whose address coincides with that of arr[1] (in the same sense that the address of arr coincides with that of arr[0]), and therefore no object to which ptr[0] could possibly point. Can one nonetheless convert the value of that expression to one of type int_ptr (here given the name temp) that does point to an object (namely the integer object also called arr[1])?

The point where I think behaviour is undefined is in the evaluation of ptr[0], which is equivalent (per 5.2.1 [expr.sub]) to *(ptr+0); more precisely, the evaluation of ptr+0 has undefined behaviour.

I'll cite my copy of the C++ standard, which is not official [N3337], but the language has probably not changed; what bothers me slightly is that the section number does not at all match the one mentioned in the accepted answer of the linked question. Anyway, for me it is §5.7 [expr.add]:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce overflow; otherwise the behavior is undefined.

Since the pointer operand ptr has type pointer to two_ints, the "array object" mentioned in the cited text would have to be an array of two_ints objects. However, there is only one such object here: the fictive array whose unique element is arr, which we are supposed to conjure up in such situations (as per: "pointer to nonarray object behaves the same as a pointer to the first element of an array of length one..."). But clearly ptr does not point to its unique element arr. So even though ptr and ptr+0 are no doubt equal values, neither of them points to an element of any array object at all (not even a fictive one), nor one past the end of such an array object, and the condition of the cited phrase is not met. The consequence is (not that overflow is produced, but) that behavior is undefined.

So behavior is already undefined before the indirection operator * is applied. I would not argue for undefined behavior from the latter evaluation, even though the phrase "the result is an lvalue referring to the object or function to which the expression points" is hard to interpret for expressions that do not refer to any object at all. But I would be lenient in interpreting this, since I think dereferencing a pointer past an array should not itself be undefined behavior (for instance if used to initialise a reference).

This would suggest that if instead of ptr[0][0] one wrote (*ptr)[0] or **ptr, then behaviour would not be undefined. This is curious, but it would not be the first time the C++ standard surprises me.

It depends on what you mean by "correct". You are casting the pointer to arr[1]. In C++ this will probably be a reinterpret_cast. C and C++ are languages which (most of the time) assume that the programmer knows what he is doing. The fact that this code is buggy has nothing to do with the fact that it is valid C/C++ code.

You are not violating any rules in the standards (as far as I can see).

Trying to answer here why the code works on commonly used compilers:

int arr[2];

int (*ptr)[2] = (int (*)[2]) &arr[1];

printf("%p\n", (void*)ptr);
printf("%p\n", (void*)*ptr);
printf("%p\n", (void*)ptr[0]);

All lines print the same address on commonly used compilers. So, ptr is an object for which *ptr represents the same memory location as ptr on commonly used compilers; therefore ptr[0] is really a pointer to arr[1], and therefore ptr[0][0] is arr[1]. So, the code assigns a value to arr[1].

Now, let's suppose a perverse implementation where a pointer to an array is the pointer to the second byte within the array. (NOTE: I'm saying pointer to an array, i.e. &arr, which has the type int (*)[2], not arr, which means the same as &arr[0] and has the type int*.) Then dereferencing ptr is the same as subtracting 1 from ptr using char* arithmetic. For structs and unions, it is guaranteed that a pointer to such a type is the same as a pointer to its first element, but in the question about casting a pointer to array into a pointer, no such guarantee was found for arrays (i.e. that a pointer to an array would be the same as a pointer to the first element of the array), and as a matter of fact @FUZxxl planned to file a defect report about the standard. For such a perverse implementation, *ptr, i.e. ptr[0], would not be the same as &arr[1]. On RISC processors, it would as a matter of fact cause problems due to data alignment.

Some additional fun:

int arr[2] = {0, 0};
int *ptr = (int*)&arr;
ptr[0] = 5;
printf("%d\n", arr[0]);

Should that code work? It prints 5.

Even more fun:

int arr[2] = {0, 0};
int (*ptr)[3] = (int(*)[3])&arr;
ptr[0][0] = 6;
printf("%d\n", arr[0]);

Should this work? It prints 6.

This should obviously work:

int arr[2] = {0, 0};
int (*ptr)[2] = &arr;
ptr[0][0] = 7;
printf("%d\n", arr[0]);
