Strict Aliasing Rule and Type Aliasing in C++

I am trying to get a grasp of the undefined behavior that results from violating the strict aliasing rule. I have read many posts on SO in order to understand it. However, one question remains: I do not really understand when two types illegally alias. cppreference states:

Type aliasing

Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:

  • AliasedType and DynamicType are similar.
  • AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.
  • AliasedType is std::byte (since C++17), char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.

I also found a nice example on SO where I clearly see the issue:

#include <iostream>

int foo(float *f, int *i) {
    *i = 1;
    *f = 0.f;

    return *i;
}

int main() {
    int x = 0;

    std::cout << x << "\n";   // Expect 0
    x = foo(reinterpret_cast<float*>(&x), &x);
    std::cout << x << "\n";   // Expect 0?
}

int and float are non-similar types, and this program possibly wreaks havoc. What I fail to see and understand is the following modification:

struct A
{
    int a;
};

struct B
{
    int b;
};

A foo( A *a, B *b ) { 
    a->a = 1;               
    b->b = 0;            

    return *a;
}

int main() {
    A a;
    a.a = 0;


    std::cout << a.a << "\n";   // Expect 0
    a = foo(&a, reinterpret_cast<B*>(&a));
    std::cout << a.a << "\n";   // Expect 0?
}

Are A and B similar types, so that everything is fine, or do they alias illegally, so that I have undefined behavior? And if it is legal, is this because A and B are aggregates (and if so, what would I have to change to make it undefined behavior)?

Any pointers and help would be much appreciated.

EDIT: On the issue of being a duplicate

I am aware of this post, but I do not see where it clarifies which types are similar, at least not to an extent that I can understand. I would therefore be grateful if you did not close this question.

No, it's not legal and you have Undefined Behavior:

8.2.1 Value category [basic.lval]

11 If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: [63]

(11.1) the dynamic type of the object,

(11.2) a cv-qualified version of the dynamic type of the object,

(11.3) a type similar (as defined in 7.5) to the dynamic type of the object,

(11.4) a type that is the signed or unsigned type corresponding to the dynamic type of the object,

(11.5) a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

(11.6) an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

(11.7) a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

(11.8) a char, unsigned char, or std::byte type.

[63] The intent of this list is to specify those circumstances in which an object may or may not be aliased.

In the expression b->b = 0; the undefined behavior is not due to the assignment, but to the class member access expression, b->b. If this expression were not UB, your code would not be UB.

In [expr.ref]/1 it is specified that a class member access constitutes an access to the object b (on the left side of ->):

A postfix expression followed by a dot . or an arrow ->, optionally followed by the keyword template ([temp.names]), and then followed by an id-expression, is a postfix expression. The postfix expression before the dot or arrow is evaluated;[67] the result of that evaluation, together with the id-expression, determines the result of the entire postfix expression.

[67] If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.

(bold mine)

So b->b reads the value of the object a through an expression of type B, and the rule you cite applies here.

Regarding similar types, the reinterpret_cast section has some helpful explanation and examples:

Informally, two types are similar if, ignoring top-level cv-qualification:

  • they are the same type; or
  • they are both pointers, and the pointed-to types are similar; or
  • they are both pointers to member of the same class, and the types of the pointed-to members are similar; or
  • they are both arrays of the same size or both arrays of unknown bound, and the array element types are similar.

For example:

  • const int * volatile * and int * * const are similar;
  • const int (* volatile S::* const)[20] and int (* const S::* volatile)[20] are similar;
  • int (* const *)(int *) and int (* volatile *)(int *) are similar;
  • int (S::*)() const and int (S::*)() are not similar;
  • int (*)(int *) and int (*)(const int *) are not similar;
  • const int (*)(int *) and int (*)(int *) are not similar;
  • int (*)(int * const) and int (*)(int *) are similar (they are the same type);
  • std::pair<int, int> and std::pair<const int, int> are not similar.

This rule enables type-based alias analysis, in which a compiler assumes that the value read through a glvalue of one type is not modified by a write to a glvalue of a different type (subject to the exceptions noted above).

Note that many C++ compilers relax this rule, as a non-standard language extension, to allow wrong-type access through the inactive member of a union (such access is not undefined in C).
