Pointer arithmetic result in pointer to another struct member via previous member address (in the same struct)

What is the C standard's view on pointer arithmetic that produces a pointer to another struct member from the address of the preceding member in the same struct?


Code 1 (without struct), mystery_1

int mystery_1(void)
{
    int one = 1, two = 2;
    int *p1 = &one + 1;
    int *p2 = &two;
    unsigned long i1 = (unsigned long) p1;
    unsigned long i2 = (unsigned long) p2;

    if (i1 == i2)
        return p1 == p2;
    return 2;
}

From code 1, I know that the result is not determined, because there is no guarantee about how local variables are laid out on the stack.

What if I use a struct like this (code 2)?


Code 2 (with struct), mystery_2

int mystery_2(void)
{
    struct { int one, two; } my_var = {
        .one = 1, .two = 2
    };
    int *p1 = &my_var.one + 1;
    int *p2 = &my_var.two;
    unsigned long i1 = (unsigned long) p1;
    unsigned long i2 = (unsigned long) p2;
    
    if (i1 == i2)
        return p1 == p2;
    return 2;
}

Compiler output

Godbolt link: https://godbolt.org/z/jGoKfETn7

GCC 10.2

mystery_1:
        xorl    %eax, %eax # return 0, while clang returns 2 (fine as no guarantee)
        ret
mystery_2:
        movl    $1, %eax # return 1, as compiler must consider the memory order of struct members
        ret

Clang 11.0.1

mystery_1:                              # @mystery_1
        movl    $2, %eax # return 2, while gcc returns 0 (fine as no guarantee)
        retq
mystery_2:                              # @mystery_2
        movl    $1, %eax # return 1, as compiler must consider the memory order of struct members
        retq

My understanding

  • In code 1, the return value is not determined, because there is no guarantee about the memory layout of local variables on the stack.
  • In code 2, the return value is determined and well-defined as 1, since p1 == p2 yields true, because a struct guarantees the memory layout. So the address following my_var.one is that of my_var.two, and the compiler is not allowed to assume that p1 and p2 are different because of their provenance.

Questions

  • Is my understanding correct?
  • According to the C standard, does mystery_2 always return 1 because p1 == p2 yields true?
  • In mystery_2, is the compiler allowed to assume that p1 != p2, so that the function returns 0?

The problem

I had a discussion with someone regarding the struct case (mystery_2), and they said that:

p1 points to (one past) one, and p2 points to two. Those are, in the C spec, counted as different "objects". The spec then goes on to define that pointers to different objects might compare as different, even though both pointers have the exact same bit pattern.

Is my understanding correct?

No.

You're correct about the local variables, but not about the struct example.

According to C standard, does mystery_2 always return 1 as p1 == p2 yields true?

No. The C standard does not guarantee that, because there can be padding between one and two.

Practically, there's no reason for any compiler to insert padding between them in this example, and you can nearly always expect mystery_2 to return 1. But this is not required by the C standard, so a pathological compiler could insert padding between one and two and that would be perfectly valid.

With respect to padding: the only guarantee is that there can't be any padding before the first member of a struct. So a pointer to a struct and a pointer to its first member are guaranteed to compare equal once suitably converted. The standard gives no other guarantee about where padding may appear.
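To see what a particular implementation actually does, the offsets can be inspected with offsetof. The following is only a minimal sketch: struct pair is a hypothetical named version of the anonymous struct in mystery_2, and both function names are made up for illustration.

```c
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical named version of the anonymous struct used in mystery_2. */
struct pair { int one, two; };

/* Number of padding bytes this implementation places between `one` and `two`.
   The C standard allows any value here; common implementations give 0. */
size_t padding_between_members(void)
{
    return offsetof(struct pair, two)
         - (offsetof(struct pair, one) + sizeof(int));
}

/* The one layout guarantee mentioned above: no padding before the first
   member, so the struct and its first member share an address. Returns 1. */
int first_member_at_start(void)
{
    struct pair v = { .one = 1, .two = 2 };
    return (void *) &v == (void *) &v.one;
}
```

A result of 0 from padding_between_members() only describes the compiler and target it was built with; it says nothing about what the standard requires.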

Note: you should be using uintptr_t for storing pointer values (unsigned long isn't guaranteed to be able to hold a pointer value).
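As an illustration of that note, here is the struct example again with uintptr_t in place of unsigned long. The name mystery_2_uintptr is hypothetical, and uintptr_t is an optional type in the standard, so this sketch assumes an implementation that provides it.

```c
#include <stdint.h>   /* uintptr_t: optional in the C standard, widely available */

/* Same logic as mystery_2, but the integer type is wide enough for a pointer. */
int mystery_2_uintptr(void)
{
    struct { int one, two; } my_var = { .one = 1, .two = 2 };
    int *p1 = &my_var.one + 1;   /* one past `one` */
    int *p2 = &my_var.two;

    /* Convert through void *, which is the conversion the standard
       describes for uintptr_t. */
    uintptr_t i1 = (uintptr_t) (void *) p1;
    uintptr_t i2 = (uintptr_t) (void *) p2;

    if (i1 == i2)
        return p1 == p2;
    return 2;
}
```

This changes only the integer type. Whether the function returns 1 still depends on whether the implementation puts padding between one and two.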

Two basics of pointer arithmetic are, per C 2018 6.5.6 8:

  • A pointer to an element of an array may be adjusted (by addition and subtraction of an integer) to point to any element of the array or to the end (one beyond the last element). Arithmetic that goes outside that range is not defined by the C standard.
  • For pointer arithmetic, a single object acts like an array of one object.
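Both rules can be sketched in a few lines. The function names below are hypothetical, and the code only forms and subtracts pointers; it never dereferences a one-past-the-end pointer, which would not be defined.

```c
#include <stddef.h>   /* ptrdiff_t */

/* A single int acts like int[1], so &x + 1 is a valid one-past-the-end pointer. */
ptrdiff_t one_past_single_object(void)
{
    int x = 42;
    int *end = &x + 1;       /* defined, but must not be dereferenced */
    return end - &x;         /* distance in elements: 1 */
}

/* Within an array, a pointer may range from the first element to one past the last. */
ptrdiff_t array_span(void)
{
    int a[4] = { 0 };
    int *first = &a[0];
    int *end = first + 4;    /* one past the last element: defined */
    /* first + 5 would go beyond that and is not defined by the standard. */
    return end - first;      /* distance in elements: 4 */
}
```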

Therefore int *p1 = &one + 1; has defined behavior.

Regarding:

    unsigned long i1 = (unsigned long) p1;
    unsigned long i2 = (unsigned long) p2;

Since it is not the focus of this question, let's assume the implementation-defined conversion of a pointer to an unsigned long produces a value that uniquely identifies the pointer value. (That is, conversion of any address to an unsigned long only ever produces one value for that address, and conversion of the value back to a pointer reproduces the address. The C standard does not guarantee this.)

Then i1 == i2 implies p1 == p2, and vice versa. Per C 2018 6.5.9 6, p1 and p2 can compare equal only if two (which p2 points to) has been laid out in memory immediately after one (which p1 points just beyond). (In general, pointers can compare equal for other reasons, but those cases involve pointers to the same object, a structure and its first member, the same function, and so on, all of which are ruled out for this particular p1 and p2.)

So Code 1 will return 1 if two is laid out in memory just after one, and 2 otherwise.

The same is true in Code 2. The pointer arithmetic &my_var.one + 1 is defined, and the resulting p1 compares equal to p2 if and only if the member two immediately follows the member one in memory.

However, two does not have to immediately follow one. This statement is incorrect:

… struct guarantees the memory layout.

The C standard allows implementations to put padding between structure members. Common C implementations will not do this for struct { int one, two; } because it is not needed for alignment (once one is aligned, the address immediately following it is also suitably aligned for int, so no padding is needed), but the C standard does not guarantee it.
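If code relies on two immediately following one, that assumption can at least be made explicit so that an implementation with padding is rejected at compile time. This is a sketch under assumptions: it needs C11 or later for _Static_assert, and struct int_pair and the function name are hypothetical.

```c
#include <stddef.h>   /* offsetof */

/* Hypothetical named version of the struct from Code 2. */
struct int_pair { int one, two; };

/* Compilation fails on any implementation that pads between the members. */
_Static_assert(offsetof(struct int_pair, two)
                   == offsetof(struct int_pair, one) + sizeof(int),
               "unexpected padding between one and two");

/* With the layout checked above, `two` immediately follows `one`, which is
   the case in which C 2018 6.5.9 6 lets the two pointers compare equal. */
int adjacent_members_compare_equal(void)
{
    struct int_pair v = { .one = 1, .two = 2 };
    return (&v.one + 1) == &v.two;
}
```

The assertion does not make the layout portable. It only turns a silent assumption into a build error on implementations where it does not hold.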

Notes

uintptr_t, declared in <stdint.h>, is a better choice for converting pointers to integers. However, the standard only guarantees that (uintptr_t) px == (uintptr_t) py implies px == py, not that px == py implies (uintptr_t) px == (uintptr_t) py. In other words, converting two pointers to the same object to uintptr_t might produce two different values, although converting them back to pointers will result in pointers that compare as equal.
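The round-trip property that the standard does promise for uintptr_t can be sketched as follows. The function name is hypothetical, and the sketch assumes the implementation provides uintptr_t.

```c
#include <stdint.h>   /* uintptr_t */

/* A valid void * converted to uintptr_t and back compares equal to the original. */
int round_trip_compares_equal(void)
{
    int x = 0;
    void *original = &x;
    uintptr_t as_integer = (uintptr_t) original;
    void *back = (void *) as_integer;
    return back == original;   /* 1 */
}
```

Note the direction of the guarantee: it concerns the pointers after the round trip, not the integer values in between, which is why comparing two uintptr_t values is a weaker test than comparing the pointers themselves.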
