The right way to work with a network buffer in modern GCC/C++ without breaking strict-aliasing rules
The program is some sort of old-school network messaging:
// Common header for all network messages.
struct __attribute__((packed)) MsgHeader {
    uint32_t msgType;
};

// One of network messages.
struct __attribute__((packed)) Msg1 {
    MsgHeader header;
    uint32_t field1;
};

// Network receive buffer.
uint8_t rxBuffer[MAX_MSG_SIZE];

// Receive handler. The received message is already in the rxBuffer.
void onRxMessage() {
    // Detect message type
    if ( ((const MsgHeader*)rxBuffer)->msgType == MESSAGE1 ) { // Breaks strict-aliasing!
        // Process Msg1 message.
        const Msg1* msg1 = (const Msg1*)rxBuffer;
        if ( msg1->field1 == 0 ) { // Breaks strict-aliasing!
            // Some code here;
        }
        return;
    }
    // Process other message types.
}
This code violates strict aliasing in modern GCC (and is undefined behaviour in modern C++). What is the correct way to solve the problem (so that the code doesn't trigger the "strict-aliasing" warning)?
P.S. If rxBuffer is defined as:
union __attribute__((packed)) {
    uint8_t rawData[MAX_MSG_SIZE];
} rxBuffer;
and I then cast &rxBuffer to other pointer types, it doesn't cause any warnings. But is that a safe, correct, and portable way?
Define rxBuffer as a pointer to a union of uint8_t[MAX_SIZE], MsgHeader, Msg1, and whatever other types you plan to cast to. Note that this would still break the strict-aliasing rules, but in GCC it is guaranteed to work as a non-standard extension.
EDIT: if such a method would lead to an overly complicated declaration, a fully portable (if slower) way is to keep the buffer as a simple uint8_t[] and memcpy it to the appropriate message struct as soon as it has to be reinterpreted. The feasibility of this method obviously depends on your performance and efficiency needs.
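A minimal sketch of that memcpy approach, assuming GCC or Clang (for the packed attribute) and host byte order on the wire. The helper names readAs and isEmptyMsg1, and the values chosen for MAX_MSG_SIZE and MESSAGE1, are made up for illustration and are not from the question:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins for the constants used in the question.
constexpr std::size_t MAX_MSG_SIZE = 64;
constexpr std::uint32_t MESSAGE1 = 1;

struct __attribute__((packed)) MsgHeader {
    std::uint32_t msgType;
};

struct __attribute__((packed)) Msg1 {
    MsgHeader header;
    std::uint32_t field1;
};

// Hypothetical helper: copies sizeof(T) bytes starting at `offset` into a
// real T object, so the buffer is never accessed through a T pointer.
template <typename T>
T readAs(const std::uint8_t* buf, std::size_t offset = 0) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable type");
    assert(buf != nullptr);
    T out;
    std::memcpy(&out, buf + offset, sizeof(T));
    return out;
}

// Returns true when the buffer holds a Msg1 whose field1 is zero.
// The caller must guarantee that the buffer holds at least sizeof(Msg1) bytes.
bool isEmptyMsg1(const std::uint8_t* rxBuffer) {
    const MsgHeader header = readAs<MsgHeader>(rxBuffer);
    if (header.msgType != MESSAGE1) {
        return false;
    }
    const Msg1 msg = readAs<Msg1>(rxBuffer);
    return msg.field1 == 0;
}
```

For small fixed-size structs like these, compilers commonly turn the memcpy into plain loads, so the "slower" caveat may not matter much in practice; measure if it is a concern.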
EDIT 2: a third solution (if you are working on "normal" architectures) is to use char or unsigned char instead of uint8_t. Such types are guaranteed to alias everything.
Not valid, because the conversion to the message type might not work; see here.
By working with the individual bytes, you can avoid all pointer casting and eliminate portability issues with endianness and alignment:
uint32_t decodeUInt32(uint8_t *p) {
    // Decode big-endian, which is network byte order.
    return (uint32_t(p[0])<<24) |
           (uint32_t(p[1])<<16) |
           (uint32_t(p[2])<< 8) |
           (uint32_t(p[3])    );
}

void onRxMessage() {
    // Detect message type
    if ( decodeUInt32(rxBuffer) == MESSAGE1 ) {
        // Process Msg1 message.
        if ( decodeUInt32(rxBuffer+4) == 0 ) {
            // Some code here;
        }
        return;
    }
    // Process other message types.
}
As Alberto M wrote, you can change the type of your buffer and how you receive into it:
union {
    uint8_t rawData[MAX_MSG_SIZE];
    struct MsgHeader msgHeader;
    struct {
        struct MsgHeader dummy;
        struct Msg1 msg;
    } msg1;
} rxBuffer;

receiveBuffer(&rxBuffer.rawData);
if (rxBuffer.msgHeader.msgType == MESSAGE1) {
    if (rxBuffer.msg1.msg.field1) {
        // ...
or receive directly into the struct, if your receive function takes chars (uint8_t only aliases uint8_t, unlike char, which may always alias):
struct {
    struct MsgHeader msgHeader;
    union {
        struct Msg1 msg1;
        struct Msg2 msg2;
    } msg;
} rxBuffer;

recv(fd, (char *)&rxBuffer, MAX_MSG_SIZE, 0);
// handle errors and insufficient recv length
if (rxBuffer.msgHeader.msgType == MESSAGE1) {
    // ...
By the way, type punning through a union is standard and doesn't break strict aliasing. See C99-TC3 6.5 (7), and also search for "type punning".
The question is about C++, not C, so Alberto M is right that it is non-standard there and only works as a GCC extension.
Using memcpy for this works in much the same manner as above, but is standard: bytes are copied on a per-character basis, effectively reinterpreting them as a struct when the destination is accessed, just as you would when type punning through a union:
struct MsgHeader msgHeader;
memcpy(&msgHeader, rxBuffer, sizeof(msgHeader));
if (msgHeader.msgType == MESSAGE1) {
    struct Msg1 msg;
    memcpy(&msg, rxBuffer + sizeof(msgHeader), sizeof(msg));
    if (msg.field1 == 0) {
        // Some code here;
    }
}
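The memcpy snippet assumes the buffer already holds a complete message. A rough sketch of the same idea with explicit length checks is below; ParseResult, parseMessage, and the value of MESSAGE1 are hypothetical names and values chosen for illustration, the structs use the header-less Msg1 layout from this answer, and the fields are read in host byte order:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t MESSAGE1 = 1;  // hypothetical value

struct MsgHeader {
    std::uint32_t msgType;
};

struct Msg1 {
    std::uint32_t field1;
};

// Hypothetical result type, only here to make the outcome observable.
enum class ParseResult { TooShort, Unknown, Msg1Zero, Msg1NonZero };

// Copies the header and then the body out of the raw bytes, refusing to
// read past `len`. No pointer to the buffer is ever cast to a struct type.
ParseResult parseMessage(const unsigned char* data, std::size_t len) {
    MsgHeader header;
    if (len < sizeof(header)) {
        return ParseResult::TooShort;
    }
    std::memcpy(&header, data, sizeof(header));

    if (header.msgType != MESSAGE1) {
        return ParseResult::Unknown;
    }

    Msg1 msg;
    if (len < sizeof(header) + sizeof(msg)) {
        return ParseResult::TooShort;
    }
    std::memcpy(&msg, data + sizeof(header), sizeof(msg));
    return msg.field1 == 0 ? ParseResult::Msg1Zero : ParseResult::Msg1NonZero;
}
```

Here `len` would be the byte count reported by the receive call.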
Or, as Vaughn Cato wrote, you can unpack the received network buffers yourself (and should then probably also pack the ones you send). Again, this is standard compliant, and it also works around padding and byte order in a portable way:
uint8_t *buf = rxBuffer;
struct MsgHeader msgHeader;
// Read a uint32_t in big endian; widen each byte first so the shifts stay unsigned.
msgHeader.msgType = ((uint32_t)buf[3] << 0) | ((uint32_t)buf[2] << 8) |
                    ((uint32_t)buf[1] << 16) | ((uint32_t)buf[0] << 24);
if (msgHeader.msgType == MESSAGE2) {
    struct Msg2 msg;
    buf += sizeof(MsgHeader);
    // Read a uint16_t in big endian.
    msg.field1 = (uint16_t)((buf[1] << 0) | (buf[0] << 8));
    if (msg.field1 == 0) {
        // ...
Note: struct Msg1 and struct Msg2 don't contain a struct MsgHeader in the above snippets and look like this:
struct Msg1 {
    uint32_t field1;
};

struct Msg2 {
    uint16_t field1;
};
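For the packing side mentioned in this answer, a sketch of matching big-endian encoders and decoders might look like the following. The function names are illustrative only (decodeUInt32 mirrors the helper from the earlier answer, but takes a const pointer):

```cpp
#include <cassert>
#include <cstdint>

// Write a 32-bit value in big-endian (network) byte order.
void encodeUInt32(std::uint8_t* p, std::uint32_t v) {
    assert(p != nullptr);
    p[0] = static_cast<std::uint8_t>(v >> 24);
    p[1] = static_cast<std::uint8_t>(v >> 16);
    p[2] = static_cast<std::uint8_t>(v >> 8);
    p[3] = static_cast<std::uint8_t>(v);
}

// Read a 32-bit big-endian value; each byte is widened before shifting.
std::uint32_t decodeUInt32(const std::uint8_t* p) {
    assert(p != nullptr);
    return (std::uint32_t(p[0]) << 24) |
           (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) |
           (std::uint32_t(p[3]));
}

// Write a 16-bit value in big-endian byte order.
void encodeUInt16(std::uint8_t* p, std::uint16_t v) {
    assert(p != nullptr);
    p[0] = static_cast<std::uint8_t>(v >> 8);
    p[1] = static_cast<std::uint8_t>(v & 0xFF);
}

// Read a 16-bit big-endian value.
std::uint16_t decodeUInt16(const std::uint8_t* p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}
```

Because both directions go through individual bytes, the wire format stays the same regardless of the host's endianness or the compiler's struct padding.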
It boils down to this:
((const MsgHeader*)rxBuffer)->msgType
rxBuffer is of one type, but we wish to treat it as if it were of another type. I suggest the following "alias-cast":
// The pointer is non-const because memmove needs a writable destination.
MsgHeader * msg_header_p = (MsgHeader *) rxBuffer;
memmove(msg_header_p, rxBuffer, sizeof(MsgHeader));
auto msg_type = msg_header_p->msgType;
memmove (like its less flexible cousin memcpy) effectively says that the bit pattern that was available at the source (rxBuffer) will, after the call to memmove, be available at the destination (msg_header_p), even if the types are different.
You might argue that memmove does "nothing", because the source and destination are identical. But that's exactly the point. Logically, it serves the purpose of making msg_header_p an alias for rxBuffer, even though in practice a good compiler will optimize it out.
(This answer is potentially a bit controversial. I may be pushing memmove too far. I guess my logic is: first, memcpy to a new location is clearly an acceptable answer to this question; second, memmove is just a better, more general (but maybe slower) version of memcpy; third, if memcpy allows you to look at the same bit pattern via a different type, then why shouldn't memmove allow the same idea to "change" the type of a particular bit pattern? If we memcpy to a temporary area and then memcpy back to the original position, would that be OK too?)
If you want to build a full answer out of this, you'll need to alias-cast back again at some point with memmove(rxBuffer, msg_header_p, sizeof(MsgHeader)); but I guess I should await feedback on my "alias cast" first!