C aliasing rules and memcpy

While answering another question, I thought of the following example:

void *p;
unsigned x = 17;

assert(sizeof(void*) >= sizeof(unsigned));
*(unsigned*)&p = 17;        // (1)
memcpy(&p, &x, sizeof(x));  // (2)

Line 1 breaks aliasing rules. Line 2, however, is OK with respect to aliasing rules. The question is: why? Does the compiler have special built-in knowledge about functions such as memcpy, or are there some other rules that make memcpy OK? Is there a way of implementing memcpy-like functions in standard C without breaking the aliasing rules?

The C Standard is quite clear on this. The effective type of the object named by p is void*, because it has a declared type (see 6.5/6). The aliasing rules in C99 apply to both reads and writes, and the write to an object of type void* through an unsigned lvalue in (1) is undefined behavior according to 6.5/7.

In contrast, the memcpy of (2) is fine, because an access through an unsigned char lvalue may alias any object (6.5/7). The Standard specifies memcpy in 7.21.1/3 and 7.21.2.1/2 as follows:

For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value).

The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

However, if p is used afterwards, that might cause undefined behavior, depending on the bit pattern. If no such use happens, the code is fine in C.
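As a rough illustration of the last part of the question, the same character-type exemption in 6.5/7 is what allows a memcpy-like function to be written in standard C. The sketch below is not from the original post: copy_bytes and round_trip are hypothetical names, and the code assumes, like the question's example, that sizeof(void*) >= sizeof(unsigned).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): copies n bytes from
   src to dst through unsigned char lvalues, which 6.5/7 allows to access
   any object. As with memcpy, the two regions must not overlap. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Repeats line (2) of the question with copy_bytes, then copies the same
   bytes back out into an unsigned. p is never read as a void*. */
unsigned round_trip(unsigned x)
{
    void *p;
    unsigned y;

    assert(sizeof(void *) >= sizeof(unsigned));
    copy_bytes(&p, &x, sizeof x);   /* store the bytes of x in p's storage */
    copy_bytes(&y, &p, sizeof y);   /* read those bytes back into y */
    return y;
}
```

Calling round_trip(17u) only ever touches the storage of p bytewise, so the concern about using p afterwards as a pointer does not arise in this sketch.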


According to the C++ Standard, which in my opinion is far from clear on the issue, I think the following holds. Please don't take this interpretation as the only possible one; the vague/incomplete specification leaves a lot of room for speculation.

Line (1) is problematic because the alignment of &p might not be OK for the unsigned type. It changes the type of the object stored in p to unsigned int. As long as you don't access that object later on through p, aliasing rules are not broken, but alignment requirements might still be.
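The alignment concern can at least be checked. The following is a minimal sketch, assuming a C11 compiler (for _Alignof); the function name is hypothetical and not from the original post. It relies on the C11 rule that an address satisfying one alignment requirement also satisfies any weaker one. Passing the check says nothing about the aliasing rules: in C, line (1) remains undefined regardless.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the storage of a void* object is both
   large enough and aligned strictly enough to hold an unsigned, else 0.
   Requires C11 for _Alignof. */
int void_ptr_storage_fits_unsigned(void)
{
    return sizeof(void *) >= sizeof(unsigned)
        && _Alignof(void *) >= _Alignof(unsigned);
}
```

On common platforms, where pointers are at least as large and as strictly aligned as unsigned, this is expected to return 1, but the Standard does not guarantee it.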

Line (2), however, has no alignment problems and is thus valid, as long as you don't access p afterwards as a void*, which might cause undefined behavior depending on how the void* type interprets the stored bit pattern. I don't think that the type of the object is changed thereby.

There is a long GCC bug report that also discusses the implications of a write through a pointer that resulted from such a cast, and how that differs from placement-new (people on that list don't agree on what the difference is).
