
c++: union -> struct, explanation?

I have the following type definition:

typedef union {
    unsigned int Entry;
    struct {
        unsigned char EntryType;
        unsigned char EntryOffset[3];
    };
} TLineDescriptor;

I also have the following use of the type:

TLineDescriptor LineDescriptor;
LineDescriptor.Entry = 40;
LineDescriptor.EntryType = 0x81;

sizeof(LineDescriptor) shows that this variable occupies 4 bytes of memory, from which I at first assumed that it held either the int or the struct.

cout << LineDescriptor.Entry << " " << LineDescriptor.EntryType << endl;

However, the line above prints two different values, namely 129 ü. LineDescriptor.Entry apparently refers to the memory location where the value 0x81 was stored, and I'm not sure what happened to the 40. It is clear, though, that my assumption was wrong. Can someone interpret and explain the type definition properly? Understanding it is crucial for working with the code I found.

Thank you in advance.

Print EntryType in this way:

cout << "0x" << hex << (unsigned)LineDescriptor.EntryType << endl;

and you will see that ü is 0x81.

Printing this:

cout << LineDescriptor.Entry;

is undefined behavior, because only one member of a union can be "active" at a time, and your last assignment was to EntryType.

However, if we assume that this is not actually as undefined as C++ wishes it to be, then the 129 comes from:

Entry = 40, which your system stores as the bytes 28 00 00 00 in hexadecimal (least significant byte first).

With LineDescriptor.EntryType = 0x81; you changed the first byte, giving 81 00 00 00, so your printout for Entry is now 129.

Try this experiment and you will get a different result:
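
The byte-level reasoning here can be checked without reading an inactive union member. The following is a minimal sketch, assuming a 32-bit value (`std::uint32_t` stands in for the question's `unsigned int`); the helper names `bytes_of`, `value_of` and `with_first_byte` are made up for illustration. It copies the value's bytes out with `std::memcpy`, overwrites the byte at the lowest address (the one `EntryType` aliases), and copies them back.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Copy the object representation of a 32-bit value into a byte array.
std::array<unsigned char, 4> bytes_of(std::uint32_t value) {
    std::array<unsigned char, 4> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// Rebuild a 32-bit value from its four bytes, in memory order.
std::uint32_t value_of(const std::array<unsigned char, 4>& bytes) {
    std::uint32_t value = 0;
    std::memcpy(&value, bytes.data(), sizeof value);
    return value;
}

// Overwrite the byte at the lowest address, as assigning EntryType would.
std::uint32_t with_first_byte(std::uint32_t value, unsigned char first) {
    std::array<unsigned char, 4> bytes = bytes_of(value);
    bytes[0] = first;
    return value_of(bytes);
}
```

On a little-endian machine `with_first_byte(40, 0x81)` gives 129. On a big-endian machine the result differs, because the byte at the lowest address is then the most significant one.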

TLineDescriptor LineDescriptor;

LineDescriptor.Entry = 256;
LineDescriptor.EntryType = 0x81;

cout << LineDescriptor.Entry << " " << unsigned(LineDescriptor.EntryType) 
     << endl;
>> 385  129

These are not actually different values. 129 (0x81) is the character code for ü in the console's character set (code page 437 or 850, for example). The ostream operator << treats int and char types differently: it prints the numeric value for the former and the character for the latter.

So your understanding of union types is correct. Note, however, that endianness can matter when dealing with union types. For example, on little-endian machines EntryType holds the least significant byte of Entry and the EntryOffset array holds the others, whereas on big-endian machines EntryType holds the most significant byte.
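
If the intent is that the type occupies the low 8 bits of the entry and the offset the remaining 24 bits (an assumption; the question does not say), the same packing can be written with shifts and masks, which behave identically on little- and big-endian machines. A minimal sketch with hypothetical helper names:

```cpp
#include <cstdint>

// Pack an 8-bit type and a 24-bit offset into one 32-bit entry.
// Assumed layout: type in bits 0-7, offset in bits 8-31.
std::uint32_t make_entry(std::uint8_t type, std::uint32_t offset) {
    return static_cast<std::uint32_t>(type) | ((offset & 0xFFFFFFu) << 8);
}

// Extract the 8-bit type from the low byte of the entry.
std::uint8_t entry_type(std::uint32_t entry) {
    return static_cast<std::uint8_t>(entry & 0xFFu);
}

// Extract the 24-bit offset from the upper three bytes of the entry.
std::uint32_t entry_offset(std::uint32_t entry) {
    return entry >> 8;
}
```

Here `make_entry(0x81, 1)` yields 385 on every platform, because the arithmetic works on values rather than on bytes in memory.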

Your assumption is not wrong: the union holds either the int or the struct. When you assign 0x81 to the EntryType field, it overwrites part of the integer you previously assigned to Entry (its first byte in memory), which is why printing both fields shows the same quantity twice, once as an int (129) and once as a char (ü). Both have the hex value 0x81.

It holds an int and a struct at the same time, and both of them occupy the same memory. By accessing TLineDescriptor::Entry, you interpret those 4 bytes as the int. If you access them through the struct, you interpret them as 4 unsigned chars.
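
The shared storage can be made visible with `sizeof` and `offsetof`. The sketch below uses a named inner struct, since an anonymous struct inside a union is a compiler extension in C++ rather than standard; the names `Descriptor`, `Parts` and `parts` are invented for the example, and a 4-byte `unsigned int` is assumed.

```cpp
#include <cstddef>

// Same shape as TLineDescriptor, but with a named inner struct.
union Descriptor {
    unsigned int entry;
    struct Parts {
        unsigned char type;
        unsigned char offset[3];
    } parts;
};

// Both members start at the beginning of the union ...
static_assert(offsetof(Descriptor, entry) == 0, "entry starts at byte 0");
static_assert(offsetof(Descriptor, parts) == 0, "parts starts at byte 0");
// ... and the union is only as large as its largest member
// (assuming a 4-byte unsigned int and no extra padding).
static_assert(sizeof(Descriptor) == sizeof(unsigned int), "members overlap");
static_assert(sizeof(Descriptor::Parts) == 4, "1 + 3 bytes");
```

If these assertions compile, the int and the struct cover exactly the same 4 bytes, so writing through one member changes the bytes that the other member names.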

LineDescriptor.Entry = 40;

This sets the 4 bytes to an int value of 40. On a little-endian system this means that the first byte is 40 and the other 3 bytes are 0.

LineDescriptor.EntryType = 0x81;

This sets the first byte to the value 129 (0x81). (On a little-endian system this means that the value of Entry is now 129 as well, provided the remaining bytes are 0.)

Regarding the different output: when you output EntryType, it is displayed as a character instead of a number. Try:

cout << LineDescriptor.Entry << " " << static_cast<int>(LineDescriptor.EntryType) << endl;
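
The difference between the character and the numeric overloads of `operator<<` can be seen in isolation. This is a small sketch, not part of the original code; the function names `as_character`, `as_decimal` and `as_hex` are hypothetical, and a `std::ostringstream` is used in place of `cout` so the results can be inspected as strings.

```cpp
#include <sstream>
#include <string>

// Stream the byte as-is: operator<< picks the character overload.
std::string as_character(unsigned char byte) {
    std::ostringstream out;
    out << byte;
    return out.str();
}

// Convert to an integer type first to get the decimal value.
std::string as_decimal(unsigned char byte) {
    std::ostringstream out;
    out << static_cast<unsigned>(byte);
    return out.str();
}

// Same conversion, printed in hexadecimal with a 0x prefix.
std::string as_hex(unsigned char byte) {
    std::ostringstream out;
    out << "0x" << std::hex << static_cast<unsigned>(byte);
    return out.str();
}
```

`as_decimal(0x81)` returns "129" and `as_hex(0x81)` returns "0x81", while `as_character(0x81)` returns the single raw byte 0x81, which a terminal then renders according to its own character set.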
