简体   繁体   中英

Strict aliasing rule and 'char *' pointers

The accepted answer to What is the strict aliasing rule? mentions that you can use char * to alias another type but not the other way.

It doesn't make sense to me — if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?

if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?

It does, but that's not the point.

The point is that if you have one or more struct something s then you may use a char* to read their constituent bytes, but if you have one or more char s then you may not use a struct something* to read them.

The wording in the referenced answer is slightly erroneous, so let's get that ironed out first: One object never aliases another object , but two pointers can “alias” the same object (meaning, the pointers point to the same memory location — as MM pointed out, this is still not 100% correct wording but you get the idea). Also, the standard itself doesn't (to the best of my knowledge) actually talk about strict aliasing at all, but merely lays out rules that govern through which kinds of expressions an object may be accessed or not. Compiler flags like -fno-strict-aliasing tell the compiler whether it can assume the programmer followed those rules (so it can perform optimizations based on that assumption) or not.

Now to your question: Any object can be accessed through a pointer to char , but a char object (especially a char array) may not be accessed through most other pointer types. Based on that, the compiler is required to make the following assumptions:

  1. If the type of the actual object itself is not known, both char* and T* pointers may always point to the same object (alias each other) — symmetric relationship .
  2. If types T1 and T2 are not “related” and neither is char , then T1* and T2* may never point to the same object — symmetric relationship .
  3. A char* pointer may point to a char object or an object of any type T .
  4. A T* pointer may not point to a char object — a symmetric relationship .

I believe, the main rationale behind the asymmetric rules about accessing object through pointers is that a char array might not satisfy the alignment requirements of, eg, an int .

So, even without compiler optimizations based on the strict aliasing rule, writing an int to the location of a 4-byte char array at addresses 0x1, 0x2, 0x3, 0x4, for instance, will — in the best case — result in poor performance and — in the worst case — access a different memory location, because the CPU instructions might ignore the lowest two address bits when writing a 4-byte value (so here this might result in a write to 0x0, 0x1, 0x2, and 0x3).

Please also be aware that the meaning of “related” differs between C and C++, but that is not relevant to your question.

if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?

Pointers don't alias each other; that's sloppy use of language. Aliasing is when an lvalue is used to access an object of a different type. (Dereferencing a pointer gives an lvalue).

In your example, what's important is the type of the object being aliased. For a concrete example let's say that the object is a double . Accessing the double by dereferencing a char * pointing at the double is fine because the strict aliasing rule permits this. However, accessing a double by dereferencing a struct something * is not permitted (unless, arguably, the struct starts with double !).

If the compiler is looking at a function which takes char * and struct something * , and it does not have available the information about the object being pointed to (this is actually unlikely as aliasing passes are done at a whole-program optimization stage); then it would have to allow for the possibility that the object might actually be a struct something * , so no optimization could be done inside this function.

Many aspects of the C++ Standard are derived from the C Standard, which needs to be understood in the historical context when it was written. If the C Standard were being written to describe a new language which included type-based aliasing, rather than describing an existing language which was designed around the idea that accesses to lvalues were accesses to bit patterns stored in memory, there would be no reason to give any kind of privileged status to the type used for storing characters in a string. Having explicit operations to treat regions of storage as bit patterns would allow optimizations to be simultaneously more effective and safer. Had the C Standard been written in such fashion, the C++ Standard presumably would have been likewise.

As it is, however, the Standard was written to describe a language in which a very common idiom was to copy the values of objects by copying all of the bytes thereof, and the authors of the Standard wanted to allow such constructs to be usable within portable programs.

Further, the authors of the Standard intended that implementations process many non-portable constructs "in a documented manner characteristic of the environment" in cases where doing so would be useful, but waived jurisdiction over when that should happen, since compiler writers were expected to understand their customers' and prospective customers' needs far better than the Committee ever could.

Suppose that in one compilation unit, one has the function:

void copy_thing(char *dest, char *src, int size)
{
  while(size--)
    *(char volatile *)(dest++) = *(char volatile*)(src++);
}

and in another compilation unit:

float f1,f2;
float test(void)
{
  f1 = 1.0f;
  f2 = 2.0f;
  copy_thing((char*)&f2, (char*)&f1, sizeof f1);
  return f2;
}

I think there would have been a consensus among Committee members that no quality implementation should treat the fact that copy_thing never writes to an object of type float as an invitation to assume that the return value will always be 2.0f. There are many things about the above code that should prevent or discourage an implementation from consolidating the read of f2 with the preceding write, with or without a special rule regarding character types, but different implementations would have different reasons for their forfearance.

It would be difficult to describe a set of rules which would require that all implementations process the above code correctly without blocking some existing or plausible implementations from implementing what would otherwise be useful optimizations. An implementation that treated all inter-module calls as opaque would handle such code correctly even if it was oblivious to the fact that a cast from T1 to T2 is a sign that an access to a T2 may affect a T1, or the fact that a volatile access might affect other objects in ways a compiler shouldn't expect to understand. An implementation that performed cross-module in-lining and was oblivious to the implications of typecasts or volatile would process such code correctly if it refrained from making any aliasing assumptions about accesses via character pointers.

The Committee wanted to recognize something in the above construct that compilers would be required to recognize as implying that f2 might be modified, since the alternative would be to view such a construct as Undefined Behavior despite the fact that it should be usable within portable programs. The fact that they chose the fact that the access was made via character pointer was the aspect that forced the issue was never intended to imply that compilers be oblivious to everything else, even though unfortunately some compiler writers interpret the Standard as an invitation to do just that.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM