Does one-byte struct alignment conflict with the architecture's alignment requirements?

I previously posted a question here about aligned access during pointer casting. In summary, it is better to avoid unaligned access if the code is to be fully portable, because some architectures may raise an exception, or performance may be considerably worse than with aligned access.

However, there are cases where I want one-byte alignment. For example, when transferring network data I don't want extra padding added inside the structure. What is usually done is:

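#include <stdint.h>

typedef uint8_t  u8;    /* assumed: u8/u16 are the fixed-width unsigned types */
typedef uint16_t u16;

#pragma pack (push, 1)  /* pack members on 1-byte boundaries */
struct tTelegram
{
   u8  cmd;
   u8  index;
   u16 addr1_16;
   u16 addr2_16;
   u8  length_low;
   u8  data[1];         /* presumably the first byte of a longer payload */
};
#pragma pack (pop)      /* restore the previous packing setting */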

Then you might already know my question: If I enforce one-byte alignment on my struct, does that mean it cannot be fully portable, because struct members are not aligned? What if I want both no padding and portability?

Firstly, a misaligned memory access is an access to a single piece of data whose address is not a multiple of its required alignment, so that it may span multiple words in memory. For example, on a 32-bit system a 32-bit int at address 0, 4, 8, etc. is aligned, but one at address 1, 2, 3, 5, 6, 7, 9, etc. is misaligned.
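As a small illustration of the address rule above, here is a minimal sketch of an alignment check. The helper name `is_aligned` is made up for this example, and it assumes the platform provides `uintptr_t` and that alignments are powers of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the address p is a multiple of align,
 * 0 otherwise. align is assumed to be a non-zero power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power of two, the low bits of the address must all be zero. */
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

With a 4-byte alignment, addresses 0, 4 and 8 pass the check, while 1, 2, 3 and 5 do not.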

Secondly, misaligned data doesn't "throw an exception" in the C++ sense, but it may raise an interrupt/trap/exception at the CPU level, e.g. SIGBUS on UNIX, which you would generally handle with a signal handler. If you need to parse misaligned data in a portable way, though, you wouldn't do so by catching signals; you'd manually code the steps to pack and unpack data spanning word boundaries.
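One common way to do this without ever dereferencing a misaligned pointer is to copy the bytes with `memcpy`. The following is only a sketch: the function names are invented for the example, and the value is read and written in the host's own byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 32-bit value from any address, aligned or
 * not. The bytes are copied one object at a time, so the source pointer is
 * never dereferenced as a uint32_t. Result is in host byte order. */
static uint32_t read_u32_unaligned(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the matching write, also in host byte order. */
static void write_u32_unaligned(void *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

Compilers for CPUs that tolerate misaligned loads will typically turn the `memcpy` into a single load, so this need not cost anything where the hardware allows it.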

In your tTelegram struct the data is not "misaligned" (each u16 member sits at an even offset from the start of the struct), but the bit shifting and masking needed to pack the data into, or unpack it from, a register is still likely to be slower, requiring more machine code instructions, than using data that occupies an independent word.
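The member offsets can be checked at compile time. This sketch assumes that `u8` and `u16` are the fixed-width types from `<stdint.h>` (the question does not show their definitions) and that the compiler supports `#pragma pack` and C11 `static_assert`.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

typedef uint8_t  u8;   /* assumed definitions of the question's types */
typedef uint16_t u16;

#pragma pack(push, 1)
struct tTelegram {
    u8  cmd;
    u8  index;
    u16 addr1_16;
    u16 addr2_16;
    u8  length_low;
    u8  data[1];
};
#pragma pack(pop)

/* Both 16-bit members land on even offsets within the struct. */
static_assert(offsetof(struct tTelegram, addr1_16) == 2, "addr1_16 offset");
static_assert(offsetof(struct tTelegram, addr2_16) == 4, "addr2_16 offset");
static_assert(offsetof(struct tTelegram, data) == 7, "data offset");
```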

Regarding portability: all non-toy compilers have an option to pack in the way you've described, but the exact pragma varies. The layout of bytes in multi-byte values may still be big-endian or little-endian (or something plain weird), and while some CPUs allow some misaligned data access (e.g. x86), others don't (e.g. UltraSPARC).
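To illustrate the byte-order point, here is a rough sketch with two hypothetical helpers: one probes the host's byte order at run time, the other assembles a 16-bit value from two bytes with shifts, which gives the same result on any host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise,
 * by looking at which byte of a 16-bit 1 is stored first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: interprets p[0], p[1] as a big-endian 16-bit value.
 * Only shifts on byte values are used, so the host byte order is irrelevant. */
static uint16_t load_be16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}
```

Copying the bytes 0x12, 0x34 straight into a `uint16_t` yields 0x3412 on a little-endian host and 0x1234 on a big-endian one, whereas `load_be16` yields 0x1234 on both.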

When transferring data between different computers you always want to format your data. Note that a data format doesn't have to be human-readable; it can very well be binary. A binary format would include the exact position of each data item, its type, the order in which the bytes of multi-byte data appear, the size or a way to determine the size, etc. Not using a defined format will bite, probably sooner rather than later.
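A defined binary format for the telegram might be encoded along these lines. This is a sketch under stated assumptions, not the asker's actual protocol: big-endian byte order is assumed, `length_low` is treated as the payload length in bytes, and the names `struct telegram`, `put_be16` and `telegram_encode` are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TELEGRAM_HEADER_SIZE = 7 };  /* cmd, index, addr1, addr2, length */

/* Ordinary, unpacked in-memory form; the compiler may pad it freely. */
struct telegram {
    uint8_t        cmd;
    uint8_t        index;
    uint16_t       addr1;
    uint16_t       addr2;
    uint8_t        length;   /* assumed: number of payload bytes */
    const uint8_t *data;
};

/* Writes v as two bytes, most significant first (assumed wire order). */
static void put_be16(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v & 0xFF);
}

/* Encodes t into out. Returns the number of bytes written,
 * or 0 if the buffer capacity cap is too small. */
static size_t telegram_encode(const struct telegram *t, uint8_t *out, size_t cap)
{
    size_t total;
    assert(t != NULL && out != NULL);
    total = TELEGRAM_HEADER_SIZE + (size_t)t->length;
    if (cap < total)
        return 0;
    out[0] = t->cmd;
    out[1] = t->index;
    put_be16(out + 2, t->addr1);
    put_be16(out + 4, t->addr2);
    out[6] = t->length;
    if (t->length != 0)
        memcpy(out + TELEGRAM_HEADER_SIZE, t->data, t->length);
    return total;
}
```

Because every byte position is written explicitly, the result does not depend on struct padding, packing pragmas, or the host's byte order.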

Put differently, although I have seen approaches like the one you describe used, I don't think they are normal, and they are certainly not normal when it comes to a defined format between different entities (companies for sure, probably also different departments and/or groups). In the places where I have worked, the exact format for receiving and sending data was always defined. If the defined format can be matched by the data layout of a struct, the struct is certainly also used to decode the data, but this is known not to be portable, and code meant to be portable doesn't attempt to use facilities like this. Instead it uses something which reads/writes the relevant records and decodes/encodes the different fields appropriately. Often the decoding/encoding code is generated from some sort of meta format describing the exact data layout.
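As a very small stand-in for such a meta format, the layout can be written down as a table of field descriptors that a generic decoder walks. Everything here is hypothetical: the descriptor type, the table, the big-endian byte order, and the `decode_field` function are assumptions made for the sketch; real projects often generate this kind of code from a schema instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: where a field lives on the wire and how wide it is. */
struct field_desc {
    const char *name;
    size_t      offset;  /* byte offset within the record */
    size_t      size;    /* width in bytes, at most 4 here */
};

/* Layout of the telegram header, taken from the struct in the question. */
static const struct field_desc telegram_fields[] = {
    {"cmd",        0, 1},
    {"index",      1, 1},
    {"addr1_16",   2, 2},
    {"addr2_16",   4, 2},
    {"length_low", 6, 1},
};

/* Decodes one field as a big-endian unsigned integer (assumed wire order).
 * Returns 0 on success, -1 if the field is too wide or lies outside buf. */
static int decode_field(const uint8_t *buf, size_t len,
                        const struct field_desc *f, uint32_t *out)
{
    uint32_t v = 0;
    size_t i;
    assert(buf != NULL && f != NULL && out != NULL);
    if (f->size > 4 || f->offset > len || f->size > len - f->offset)
        return -1;
    for (i = 0; i < f->size; i++)
        v = (v << 8) | buf[f->offset + i];
    *out = v;
    return 0;
}
```

The decoder reads byte by byte, so it needs neither a packed struct nor aligned input, and a change to the format only touches the table.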
