
Strict aliasing rule

I'm reading the notes about reinterpret_cast and its aliasing rules ( http://en.cppreference.com/w/cpp/language/reinterpret_cast ).

I wrote this code:

struct A
{
  int t;
};

char *buf = new char[sizeof(A)];

A *ptr = reinterpret_cast<A*>(buf);
ptr->t = 1;

A *ptr2 = reinterpret_cast<A*>(buf);
cout << ptr2->t;

I think none of these rules applies here:

  • T2 is the (possibly cv-qualified) dynamic type of the object
  • T2 and T1 are both (possibly multi-level, possibly cv-qualified at each level) pointers to the same type T3 (since C++11)
  • T2 is an aggregate type or a union type which holds one of the aforementioned types as an element or non-static member (including, recursively, elements of subaggregates and non-static data members of the contained unions): this makes it safe to cast from the first member of a struct and from an element of a union to the struct/union that contains it.
  • T2 is the (possibly cv-qualified) signed or unsigned variant of the dynamic type of the object
  • T2 is a (possibly cv-qualified) base class of the dynamic type of the object
  • T2 is char or unsigned char

In my opinion this code is incorrect. Am I right? Is the code correct or not?

On the other hand, what about the connect function (man 2 connect) and struct sockaddr?

   int connect(int sockfd, const struct sockaddr *addr,
               socklen_t addrlen);

E.g. we have struct sockaddr_in and we have to cast it to struct sockaddr. The above rules don't apply either, so is this cast incorrect?

Yeah, it's invalid, but not because you're converting a char* to an A*: it's because you are not obtaining an A* that actually points to an A and, as you've identified, none of the type aliasing options fit.

You'd need something like this:

#include <new>
#include <iostream>

struct A
{
  int t;
};

char *buf = new char[sizeof(A)];

A* ptr = new (buf) A;
ptr->t = 1;

// Also valid, because points to an actual constructed A!
A *ptr2 = reinterpret_cast<A*>(buf);
std::cout << ptr2->t;

Now type aliasing doesn't come into it at all (though keep reading because there's more to do!).

In reality, this is not enough. We must also consider alignment. Though the above code may appear to work, to be fully safe you will need to placement-new into a properly-aligned region of storage, rather than just a casual block of chars.

The standard library (since C++11) gives us std::aligned_storage to do this:

using Storage = std::aligned_storage<sizeof(A), alignof(A)>::type;
auto* buf = new Storage;

Or, if you don't need to dynamically allocate it, just:

Storage data;

Then, do your placement-new:

new (buf) A();
// or: new(&data) A();

And to use it:

auto ptr = reinterpret_cast<A*>(buf);
// or: auto ptr = reinterpret_cast<A*>(&data);

All in all, it looks like this:

#include <iostream>
#include <new>
#include <type_traits>

struct A
{
  int t;
};

int main()
{
    using Storage = std::aligned_storage<sizeof(A), alignof(A)>::type;

    auto* buf = new Storage;
    A* ptr = new(buf) A();

    ptr->t = 1;

    // Also valid, because points to an actual constructed A!
    A* ptr2 = reinterpret_cast<A*>(buf);
    std::cout << ptr2->t;
}


Even then, since C++17 this is somewhat more complicated; see the relevant cppreference pages for more information and pay attention to std::launder.

Of course, this whole thing appears contrived because you only want one A and therefore don't need array form; in fact, you'd just create a bog-standard A in the first place. But, assuming buf is actually larger in reality and you're creating an allocator or something similar, this makes some sense.

The C aliasing rules from which the rules of C++ were derived included a footnote specifying that the purpose of the rules was to say when things may alias. The authors of the Standard didn't think it necessary to forbid implementations from applying the rules in needlessly restrictive fashion in cases where things don't alias, because they thought compiler writers would honor the proverb "Don't prevent the programmer from doing what needs to be done", which the authors of the Standard viewed as part of the Spirit of C.

Situations where it would be necessary to use an lvalue of an aggregate's member type to actually alias a value of the aggregate type are rare, so it's entirely reasonable that the Standard doesn't require compilers to recognize such aliasing. Applying the rules restrictively in cases that don't involve aliasing, however, would cause something like:

union foo {int x; float y;} foo;
int *p = &foo.x;
*p = 1;

or even, for that matter,

union foo {int x; float y;} foo;
foo.x = 1;

to invoke UB since the assignment is used to access the stored values of a union foo and a float using an int, which is not one of the allowed types. Any quality compiler, however, should be able to recognize that an operation done on an lvalue which is visibly freshly derived from a union foo is an access to a union foo, and an access to a union foo is allowed to affect the stored values of its members (like the float member in this case).

The authors of the Standard probably declined to make the footnote normative because doing so would require a formal definition of when an access via a freshly-derived lvalue is an access to the parent, and what kinds of access patterns constitute aliasing. While most cases would be pretty clear cut, there are some corner cases which implementations intended for low-level programming should probably interpret more pessimistically than those intended for e.g. high-end number crunching, and the authors of the Standard figured that anyone who could figure out how to handle the harder cases should be able to handle the easy ones.
