Interpreting unsigned char array as bool array

Consider the following code. Is it ok or does it lead to undefined behaviour?

#include <iostream>

int main()
{
    {
        unsigned char binary[] = {0, 5, 10};
        bool* x = reinterpret_cast<bool*>(&binary[0]);

        for (unsigned int i = 0; i < 3; ++i)
        {
            std::cout << (x[i] ? 1 : 0) << " ";
        }
    }

    {
        unsigned char b = 255;
        bool* x = reinterpret_cast<bool*>(&b);
        std::cout << (*x ? 1 : 0) << std::endl;
    }

    return 0;
}

Output when compiled with gcc 4.6 to 4.8 is

0 5 10 1

but only with optimization (-O1 and higher). Clang results in

0 1 1 1

even with optimization. Now if I change x[i] ? 1 : 0 to x[i] ? 2 : 1, the gcc result is

1 2 2 1

Any ideas or is it simply undefined behaviour because of the cast?

The standard does not guarantee that bool is at all compatible with char (i.e. it does not guarantee that they have the same size or alignment), saying:

Values of type bool are either true or false.

§3.9.1 [basic.fundamental]

It also says:

Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false .

Footnote 47 (N3337)

Therefore you are in the realm of undefined behaviour.

Note that since the standard does not make an exception for bool, the following rules apply:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

— the dynamic type of the object,

— a cv-qualified version of the dynamic type of the object,

— a type similar (as defined in 4.4) to the dynamic type of the object,

— a type that is the signed or unsigned type corresponding to the dynamic type of the object,

— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

— an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

— a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

— a char or unsigned char type.

§3.10 [basic.lval]

The types of the objects in this case are unsigned char, so attempting to access them through a bool lvalue (which is what you obtain by dereferencing a bool*) leads to UB.

This violates the strict aliasing rule. To quote an earlier answer of mine:

The strict aliasing rules make it illegal to access an object through a pointer of a different type, although access through a char * is allowed. The compiler is allowed to assume that pointers of different types do not point to the same memory and optimize accordingly. It also means the code invokes undefined behavior and could really do anything.

The draft standard covers this in section 3.10 Lvalues and rvalues, paragraph 10:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

— the dynamic type of the object,

— a cv-qualified version of the dynamic type of the object,

— a type similar (as defined in 4.4) to the dynamic type of the object,

— a type that is the signed or unsigned type corresponding to the dynamic type of the object,

— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

— an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

— a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

— a char or unsigned char type.

Even if aliasing were not a problem, it is not clear that interpreting the bytes of an unsigned char as a bool is valid at all. Section 3.9.1 Fundamental types says:

Values of type bool are either true or false.[...]

where footnote 47 says:

Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false.
