
Strict Aliasing Rule and Type Aliasing in C++

I am trying to get a grasp of undefined behavior when violating the strict aliasing rule. I have read many posts on SO in order to understand it. However, one question remains: I do not really understand when two types illegally alias. cppreference states:

Type aliasing

Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:

  • AliasedType and DynamicType are similar.
  • AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.
  • AliasedType is std::byte (since C++17), char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.
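To make the last bullet concrete, here is a minimal sketch of inspecting an object's bytes through unsigned char. The helper name object_bytes and the demo function are made up for illustration and are not part of the quoted reference; the same idea should work with char or, from C++17 on, std::byte.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the quoted text): copies out the object
// representation of any object by reading it through unsigned char.
template <class T>
std::vector<unsigned char> object_bytes(const T& value) {
    // unsigned char is allowed to alias any object type, so these reads
    // do not violate the aliasing rule.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    return std::vector<unsigned char>(p, p + sizeof(T));
}

// Small self-check: the representation of a single unsigned char is
// that one byte.
void demo_object_bytes() {
    unsigned char c = 0xAB;
    std::vector<unsigned char> bytes = object_bytes(c);
    assert(bytes.size() == 1 && bytes[0] == 0xAB);
}
```

The byte values you get for wider types depend on the platform (endianness, padding), so only the size is portable in general.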

I also found a nice example on SO where I clearly see the issue:

#include <iostream>

int foo( float *f, int *i ) {
    *i = 1;
    *f = 0.f;

    return *i;
}

int main() {
    int x = 0;

    std::cout << x << "\n";   // Expect 0
    x = foo(reinterpret_cast<float*>(&x), &x);
    std::cout << x << "\n";   // Expect 0?
}

int and float are non-similar types and this program possibly wreaks havoc. What I fail to see and understand is the following modification:

#include <iostream>

struct A {
    int a;
};

struct B {
    int b;
};

A foo( A *a, B *b ) {
    a->a = 1;
    b->b = 0;

    return *a;
}

int main() {
    A a;
    a.a = 0;

    std::cout << a.a << "\n";   // Expect 0
    a = foo(&a, reinterpret_cast<B*>(&a));
    std::cout << a.a << "\n";   // Expect 0?
}

Are A and B similar types and everything is fine, or are they illegally aliasing and I have undefined behavior? And if it is legal, is this because A and B are aggregates (if yes, what would I have to change to make it undefined behavior)?

Any hints and help would be much appreciated.

EDIT On the issue of being duplicate

I am aware of this post, but I do not see where it clarifies which types are similar, at least not to an extent that I would understand. Therefore it would be kind if you did not close this question.

No, it's not legal and you have Undefined Behavior:

8.2.1 Value category [basic.lval]

11 If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: [63]

(11.1) — the dynamic type of the object,

(11.2) — a cv-qualified version of the dynamic type of the object,

(11.3) — a type similar (as defined in 7.5) to the dynamic type of the object,

(11.4) — a type that is the signed or unsigned type corresponding to the dynamic type of the object,

(11.5) — a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

(11.6) — an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

(11.7) — a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

(11.8) — a char, unsigned char, or std::byte type

[63] The intent of this list is to specify those circumstances in which an object may or may not be aliased.
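For contrast with the A/B case, here is a small sketch of two accesses that I believe the list does permit, namely (11.2) and (11.4). The function names are invented for this illustration.

```cpp
#include <cassert>

// Hypothetical helpers illustrating two entries of the quoted list.

// (11.4): an int object may be read through a glvalue of the
// corresponding unsigned type.
unsigned read_as_unsigned(int& x) {
    return *reinterpret_cast<unsigned*>(&x);
}

// (11.2): an int object may be read through a const-qualified glvalue.
int read_as_const(int& x) {
    const int& view = x;
    return view;
}

void demo_allowed_access() {
    int x = 42;
    // A non-negative value has the same representation in int and unsigned.
    assert(read_as_unsigned(x) == 42u);
    assert(read_as_const(x) == 42);
}
```

Neither helper has a counterpart for A and B: B is not the dynamic type A, not a cv-qualified or similar version of it, and not a base class of it.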

In the expression b->b = 0; the undefined behavior is not due to the assignment, but to the class member access expression, b->b. If this expression were not UB, your code would not be UB.

In [expr.ref]/1 it is specified that a class member access constitutes an access to the object designated by b (on the left side of ->):

A postfix expression followed by a dot . or an arrow ->, optionally followed by the keyword template ([temp.names]), and then followed by an id-expression, is a postfix expression. The postfix expression before the dot or arrow is evaluated;[67] the result of that evaluation, together with the id-expression, determines the result of the entire postfix expression.

[67] If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.

bold mine

So b->b reads the value of the object a through an expression of type B, and the rule you cite applies here.
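If the goal is only to reuse the bits of an A as a B, one commonly suggested way around the problem is to copy the bytes instead of casting the pointer. The sketch below assumes both structs are trivially copyable and have the same size; copy_as_B is a hypothetical name, not something from the question.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct A { int a; };
struct B { int b; };

// Hypothetical helper: builds a separate B object from the bytes of an A,
// so the A object is never accessed through a glvalue of type B.
B copy_as_B(const A& src) {
    static_assert(sizeof(A) == sizeof(B), "sizes must match");
    static_assert(std::is_trivially_copyable<A>::value &&
                  std::is_trivially_copyable<B>::value,
                  "memcpy needs trivially copyable types");
    B dst;
    std::memcpy(&dst, &src, sizeof(B));
    return dst;
}

void demo_copy_as_B() {
    A a{7};
    B b = copy_as_B(a);
    assert(b.b == 7);  // both structs hold a single int at offset 0
}
```

Unlike the reinterpret_cast version, a write to the resulting B never changes the original A, which is exactly the assumption the optimizer is allowed to make anyway.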

Regarding similar types, the reinterpret_cast section has some helpful explanation and examples:

Informally, two types are similar if, ignoring top-level cv-qualification:

  • they are the same type; or
  • they are both pointers, and the pointed-to types are similar; or
  • they are both pointers to member of the same class, and the types of the pointed-to members are similar; or
  • they are both arrays of the same size or both arrays of unknown bound, and the array element types are similar.

For example:

  • const int * volatile * and int * * const are similar;
  • const int (* volatile S::* const)[20] and int (* const S::* volatile)[20] are similar;
  • int (* const *)(int *) and int (* volatile *)(int *) are similar;
  • int (S::*)() const and int (S::*)() are not similar;
  • int (*)(int *) and int (*)(const int *) are not similar;
  • const int (*)(int *) and int (*)(int *) are not similar;
  • int (*)(int * const) and int (*)(int *) are similar (they are the same type);
  • std::pair<int, int> and std::pair<const int, int> are not similar.
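The informal rules above can be approximated with a small recursive type trait. This is only a sketch of the informal definition, not something the standard library provides; is_similar and is_similar_impl are names made up here, and S, A and B are placeholder classes.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Forward declaration so the specializations below can recurse.
template <class T, class U> struct is_similar;

// Base case: with top-level cv already stripped, the types must be identical.
template <class T, class U>
struct is_similar_impl : std::is_same<T, U> {};

// Both pointers: compare the pointed-to types.
template <class T, class U>
struct is_similar_impl<T*, U*> : is_similar<T, U> {};

// Both pointers to member of the same class C: compare the member types.
template <class T, class U, class C>
struct is_similar_impl<T C::*, U C::*> : is_similar<T, U> {};

// Both arrays of the same size, or both of unknown bound.
template <class T, class U, std::size_t N>
struct is_similar_impl<T[N], U[N]> : is_similar<T, U> {};
template <class T, class U>
struct is_similar_impl<T[], U[]> : is_similar<T, U> {};

// Entry point: ignore top-level cv-qualification, then dispatch.
template <class T, class U>
struct is_similar
    : is_similar_impl<typename std::remove_cv<T>::type,
                      typename std::remove_cv<U>::type> {};

struct S {};             // placeholder class for pointer-to-member cases
struct A { int a; };     // the two structs from the question
struct B { int b; };

static_assert(is_similar<const int* volatile*, int** const>::value, "");
static_assert(is_similar<const int (* volatile S::* const)[20],
                         int (* const S::* volatile)[20]>::value, "");
static_assert(is_similar<int (* const*)(int*), int (* volatile*)(int*)>::value, "");
static_assert(!is_similar<int (S::*)() const, int (S::*)()>::value, "");
static_assert(!is_similar<int (*)(int*), int (*)(const int*)>::value, "");
static_assert(is_similar<int (*)(int* const), int (*)(int*)>::value, "");
static_assert(!is_similar<std::pair<int, int>, std::pair<const int, int>>::value, "");
static_assert(!is_similar<A, B>::value, "");  // distinct class types
```

The last assertion is the point for the question: two unrelated class types are never similar, no matter how alike their members look.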

This rule enables type-based alias analysis, in which a compiler assumes that the value read through a glvalue of one type is not modified by a write to a glvalue of a different type (subject to the exceptions noted above).
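The exceptions matter for the optimizer too. As a sketch, here is a variant of foo from the question in which the second pointer is unsigned char*; because that type may alias anything, the compiler has to assume the int was modified. The names store_then_clear and demo_char_alias are hypothetical, and the check assumes int has no padding bits.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant of foo from the question: the second parameter is
// unsigned char*, which is allowed to alias an int object.
int store_then_clear(int* i, unsigned char* bytes) {
    *i = 1;
    // Writes through unsigned char may modify *i, so the compiler cannot
    // assume that *i is still 1 after this loop.
    for (std::size_t n = 0; n < sizeof(int); ++n) {
        bytes[n] = 0;
    }
    return *i;  // has to reflect the byte writes above
}

void demo_char_alias() {
    int x = 5;
    int r = store_then_clear(&x, reinterpret_cast<unsigned char*>(&x));
    assert(r == 0);  // every byte of x was zeroed
    assert(x == 0);
}
```

With float* in place of unsigned char*, the compiler would be entitled to return 1 without reloading, which is the behavior the original example worries about.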

Note that many C++ compilers relax this rule, as a non-standard language extension, to allow wrong-type access through the inactive member of a union (such access is not undefined in C).
