Can a pointer to a member of a struct be used to access another member of the same struct?

I'm trying to understand how type-punning works when it comes to storing a value into a member of a structure or union.

The Standard, N1570 6.2.6.1(p6), specifies that:

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

So I interpreted it as follows: if we have an object to store into a member such that the size of the object equals sizeof(declared_type_of_the_member) + padding, the bytes related to padding will have unspecified values (even though those bytes were defined in the original object). Here is an example:

struct first_member_padded_t{
    int a;
    long b;
};

int a = 10;
struct first_member_padded_t s;
char repr[offsetof(struct first_member_padded_t, b)] = { 0 }; // some value (zero used here as a placeholder)
memcpy(repr, &a, sizeof(a));
memcpy(&(s.a), repr, sizeof(repr));
s.b = 100;
printf("%d%ld\n", s.a, s.b); //prints 10100

On my machine, sizeof(int) = 4 and offsetof(struct first_member_padded_t, b) = 8.
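These figures are implementation-specific. A minimal sketch for checking them on a given platform, assuming the struct from the question; the helper names padding_after_a and print_layout are made up for illustration:

```c
#include <stddef.h>
#include <stdio.h>

struct first_member_padded_t {
    int a;
    long b;
};

/* Hypothetical helper: number of padding bytes between members a and b. */
size_t padding_after_a(void)
{
    return offsetof(struct first_member_padded_t, b) - sizeof(int);
}

/* Hypothetical helper: print the layout numbers discussed in the question. */
void print_layout(void)
{
    printf("sizeof(int) = %zu\n", sizeof(int));
    printf("offsetof(b) = %zu\n", offsetof(struct first_member_padded_t, b));
    printf("padding     = %zu\n", padding_after_a());
}
```

Calling print_layout() from main shows whether the platform in use has any padding after a at all; where long has the same size and alignment as int, the padding is typically zero and the question does not arise.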

Is the behavior of printing 10100 well defined for such a program? I think that it is.

What the memcpy Calls Do

The question is poorly posed. Let's look first at the code:

char repr[offsetof(struct first_member_padded_t, b)] = { 0 }; // some value (zero used here as a placeholder)
memcpy(repr, &a, sizeof(a));
memcpy(&(s.a), repr, sizeof(repr));

First note that repr is initialized, so all the elements in it are given values.

The first memcpy is fine: it copies the bytes of a into repr.

If the second memcpy were memcpy(&s, repr, sizeof repr);, it would copy bytes from repr into s. This would write bytes into s.a and, due to the size of repr, into any padding between s.a and s.b. Per C 2018 6.5 7 and other parts of the standard, it is permitted to access the bytes of an object (and “access” means both reading and writing, per 3.1 1). So this copy into s is fine, and it results in s.a taking on the same value that a has.
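The uncontroversial variant, copying through &s, can be sketched as follows. This is only an illustration: the function name store_via_struct_address is hypothetical, and the zero initializer stands in for the question's unspecified "some value".

```c
#include <stddef.h>
#include <string.h>

struct first_member_padded_t {
    int a;
    long b;
};

/* Hypothetical helper: copy `value` into s.a by writing through &s. */
int store_via_struct_address(int value)
{
    struct first_member_padded_t s;
    /* Zero-filled placeholder so every byte copied below has a known value. */
    char repr[offsetof(struct first_member_padded_t, b)] = { 0 };

    memcpy(repr, &value, sizeof value);  /* bytes of value -> start of repr */
    memcpy(&s, repr, sizeof repr);       /* s.a plus any padding before s.b */
    return s.a;                          /* same value as the argument */
}
```

Because the destination pointer is derived from the whole structure, there is no doubt that every byte written lies inside the object being addressed.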

However, the memcpy uses &(s.a) rather than &s. It uses the address of s.a rather than the address of s. We know that converting &s.a to a pointer to a character type would allow us to access the bytes of s.a (6.5 7 and more) (and passing it to memcpy has the same effect as such a conversion, as memcpy is specified to have the effect of copying bytes), but it is not clear that it allows us to access other bytes in s. In other words, we have a question of whether we can use &s.a to access bytes other than those in s.a.

6.7.2.1 15 tells us that, if a pointer to the first member of a structure is “suitably converted,” the result points to the structure. So, if we converted &s.a to a pointer to struct first_member_padded_t, it would point to s, and we can certainly use a pointer to s to access all the bytes in s. Thus, this would also be well defined:

memcpy((struct first_member_padded_t *) &s.a, repr, sizeof repr);
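A small sketch of the "suitably converted" rule from 6.7.2.1 15, assuming the struct from the question; the function name struct_from_first_member is invented for illustration:

```c
struct first_member_padded_t {
    int a;
    long b;
};

/* Hypothetical helper: convert a pointer to the first member back to a
   pointer to the enclosing structure. Only meaningful when `first`
   really points at the a member of such a structure. */
struct first_member_padded_t *struct_from_first_member(int *first)
{
    return (struct first_member_padded_t *) first;
}
```

Given struct first_member_padded_t s = { 1, 2 };, the expression struct_from_first_member(&s.a) compares equal to &s, and the b member can be reached through the result.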

However, memcpy(&s.a, repr, sizeof repr); only converts &s.a to void * (because memcpy is declared to take a void *, so &s.a is automatically converted during the function call) and not to a pointer to the structure type. Is that a suitable conversion? Note that if we did memcpy(&s, repr, sizeof repr);, it would convert &s to void *. 6.2.5 28 tells us that a pointer to void has the same representation as a pointer to a character type. So consider these two statements:

memcpy(&s.a, repr, sizeof repr);
memcpy(&s,   repr, sizeof repr);

Both of these statements pass a void * to memcpy, and those two void * have the same representation as each other and point to the same byte. Now, we might interpret the standard pedantically and strictly so that they are different, in that the latter may be used to access all the bytes of s and the former may not. It would then be bizarre to have two necessarily identical pointers that behave differently.

Such a severe interpretation of the C standard seems possible in theory (the difference between the pointers could arise during optimization rather than in the actual implementation of memcpy), but I am not aware of any compiler that would do this. Note that such an interpretation is at odds with section 6.2 of the standard, which tells us about types and representations. Interpreting the standard so that (void *) &s.a and (void *) &s behave differently means that two things with the same value and type may behave differently, which means a value consists of something more than its value and type. That does not seem to be the intent of 6.2 or of the standard generally.

Type-Punning

The question states:

I'm trying to understand how type-punning works when it comes to storing a value into a member of a structure or union.

This is not type-punning as the term is commonly used. Technically, the code does access s.a using lvalues of a different type than its definition (because it uses memcpy, which is defined to copy as if with character type, while the defined type is int), but the bytes originate in an int and are copied without modification, and this sort of copying the bytes of an object is generally regarded as a mechanical procedure; it is done to effect a copy and not to reinterpret the bytes in a new type. “Type-punning” usually refers to using different lvalues for the purpose of reinterpreting the value, such as writing an unsigned int and reading a float.

In any case, type-punning is not really the subject of the question.
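For contrast, here is a minimal sketch of type-punning in the usual sense, written with a union. It assumes float is 32 bits wide and uses the IEEE 754 binary32 encoding, which the C standard does not require; uint32_t is used in place of unsigned int so the widths match, and the function name float_from_bits is hypothetical.

```c
#include <stdint.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Hypothetical helper: reinterpret a 32-bit pattern as a float. */
float float_from_bits(uint32_t bits)
{
    union { uint32_t u; float f; } pun;
    pun.u = bits;   /* write the bytes as an unsigned integer */
    return pun.f;   /* read the same bytes back as a float */
}
```

Here the purpose is to reinterpret the bytes in a new type, which is what distinguishes it from the byte copy in the question.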

Values In Members

The title asks:

What values can we store in a struct or union members?

This title seems off from the content of the question. The title question is easily answered: The values we can store in a member are those values the member's type can represent. But the question goes on to explore the padding between members. The padding does not affect the values in the members.

Padding Takes Unspecified Values

The question quotes the standard:

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

and says:

So I interpreted it as follows: if we have an object to store into a member such that the size of the object equals sizeof(declared_type_of_the_member) + padding, the bytes related to padding will have unspecified values…

The quoted text in the standard means that, if the padding bytes in s have been set to some values, as with memcpy, and we then do s.a = something;, then the padding bytes are no longer required to hold their previous values.

The code in the question explores a different situation. The code memcpy(&(s.a), repr, sizeof(repr)); does not store a value in a member of the structure in the sense meant in 6.2.6.1 6. It is not storing into either of the members s.a or s.b. It is copying bytes in, which is a different thing from what is discussed in 6.2.6.1.

6.2.6.1 6 means that, for example, if we execute this code:

char repr[sizeof s] = { 0 };
memcpy(&s, repr, sizeof s); // Set all the bytes of s to known values.
s.a = 0; // Store a value in a member.
memcpy(repr, &s, sizeof s); // Get all the bytes of s to examine them.
for (size_t i = sizeof s.a; i < offsetof(struct first_member_padded_t, b); ++i)
    printf("Byte %zu = %d.\n", i, repr[i]);

then it is not necessarily true that all zeros will be printed; the bytes in the padding may have changed.
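One practical consequence, sketched here as an illustration rather than as part of the original answer: since padding bytes may hold unspecified values after a member store, two structures are more safely compared member by member than byte by byte. The names same_members and members_equal_despite_padding are hypothetical.

```c
#include <string.h>

struct first_member_padded_t {
    int a;
    long b;
};

/* Compare by members, because padding bytes may hold unspecified values. */
int same_members(const struct first_member_padded_t *x,
                 const struct first_member_padded_t *y)
{
    return x->a == y->a && x->b == y->b;
}

/* Hypothetical demo: equal members, deliberately different padding. */
int members_equal_despite_padding(void)
{
    struct first_member_padded_t x, y;
    memset(&x, 0x00, sizeof x);  /* any padding in x starts as 0x00 */
    memset(&y, 0xFF, sizeof y);  /* any padding in y starts as 0xFF */
    x.a = 1; x.b = 2;
    y.a = 1; y.b = 2;
    return same_members(&x, &y); /* 1: the members agree */
}
```

A memcmp over the two objects might or might not report a difference here, depending on whether the platform has padding and what the member stores did to it.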

In many implementations of the language the C Standard was written to describe, an attempt to write an N-byte object within a struct or union would affect the value of at most N bytes within the struct or union. On the other hand, on a platform which supported 8-bit and 32-bit stores, but not 16-bit stores, if someone declared a type like:

struct S { uint32_t x; uint16_t y;} *s;

and then executed s->y = 23; without caring about what happened to the two bytes following y, it would be faster to perform a 32-bit store to y, blindly overwriting the two bytes following it, than to perform a pair of 8-bit writes to update the upper and lower halves of y. The authors of the Standard didn't want to forbid such treatment.
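The two bytes in question are the tail padding of the structure. A small sketch for measuring that padding, assuming the struct S declared above; the function name tail_padding_after_y is made up:

```c
#include <stddef.h>
#include <stdint.h>

struct S { uint32_t x; uint16_t y; };

/* Hypothetical helper: bytes of padding that follow y inside struct S.
   These are the bytes a wide store to y would be free to overwrite. */
size_t tail_padding_after_y(void)
{
    return sizeof(struct S) - (offsetof(struct S, y) + sizeof(uint16_t));
}
```

On ABIs where uint32_t is 4-byte aligned this typically yields 2, but the exact figure is implementation-defined.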

It would have been helpful if the Standard had included a means by which implementations could indicate whether writes to structure or union members might disturb storage beyond them, so that programs that would be broken by such disturbance could refuse to run on implementations where it could occur. The authors of the Standard, however, likely expected that programmers who would be interested in such details would know what kinds of hardware their program was expected to run on, and thus know whether such memory disturbances would be an issue on such hardware.

Unfortunately, modern compiler writers seem to interpret freedoms that were intended to assist implementations for unusual hardware as an open invitation to get "creative" even when targeting platforms that could process code efficiently without such concessions.

As @user694733 said, in case there is padding between s.a and s.b, memcpy() is accessing a memory area that cannot be accessed by &a:

int a = 1;
int b;
b = *((char *)&a + sizeof(int));

This is Undefined Behaviour, and it is basically what is happening inside memcpy().
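By contrast, reading the bytes of a standalone int through a character pointer is fine as long as the index stays below sizeof(int). A minimal sketch, with the hypothetical name sum_object_bytes:

```c
#include <stddef.h>

/* Hypothetical helper: sum the bytes of *p without reading past the object. */
unsigned sum_object_bytes(const int *p)
{
    const unsigned char *bytes = (const unsigned char *) p;
    unsigned sum = 0;

    /* Valid indices are 0 .. sizeof *p - 1; index sizeof *p is one past
       the end and must not be dereferenced. */
    for (size_t i = 0; i < sizeof *p; ++i)
        sum += bytes[i];
    return sum;
}
```

The snippet above differs from this loop only in that it dereferences the one-past-the-end position.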
