Convert char buffer to struct
I have a char buffer buf containing buf[0] = 10, buf[1] = 3, buf[2] = 3, buf[3] = 0, buf[4] = 58, and a structure:
typedef struct
{
char type;
int version;
int length;
}Header;
I want to convert buf into a Header. Right now I am using these functions:
int getByte( unsigned char* buf)
{
int number = buf[0];
return number;
}
int getInt(unsigned char* buf)
{
int number = (buf[0]<<8)+buf[1];
return number;
}
int main()
{
Header *head = new Header;
int location = 0;
head->type = getByte(&buf[location]);
location++; // location = 1
head->version = getInt(&buf[location]);
location += 2; // location = 3
head->length = getInt(&buf[location]);
location += 2; // location = 5
}
I am searching for a solution such as:
Header *head = new Header;
memcpy(head, buf, sizeof(head));
With this, the first value in the Header, head->type, is correct and the rest is garbage. Is it possible to convert unsigned char* buf to Header?
The only fully portable and safe way is:
void convertToHeader(unsigned char const * const buffer, Header *header)
{
header->type = buffer[0];
header->version = (buffer[1] << 8) | buffer[2];
header->length = (buffer[3] << 8) | buffer[4];
}
and
void convertFromHeader(Header const * const header, unsigned char * buffer)
{
buffer[0] = header->type;
buffer[1] = (static_cast<unsigned int>(header->version) >> 8) & 0xFF;
buffer[2] = header->version & 0xFF;
buffer[3] = (static_cast<unsigned int>(header->length) >> 8) & 0xFF;
buffer[4] = header->length & 0xFF;
}
See "Converting bytes array to integer" for explanations.
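As a quick check of the two functions above, here is a minimal sketch that decodes the sample bytes from the question and encodes them back. It assumes the wire format is 1 byte of type followed by two big-endian 16-bit fields, which is what the question's getInt implies; roundTripMatches is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Header {
    char type;
    int version;
    int length;
};

// Decode the 5-byte wire layout: type, then version and length as
// big-endian 16-bit values (assumed from the question's getInt).
void convertToHeader(const unsigned char* buffer, Header* header) {
    header->type = static_cast<char>(buffer[0]);
    header->version = (buffer[1] << 8) | buffer[2];
    header->length = (buffer[3] << 8) | buffer[4];
}

// Encode the header back into the same 5-byte layout.
void convertFromHeader(const Header* header, unsigned char* buffer) {
    const unsigned int version = static_cast<unsigned int>(header->version);
    const unsigned int length = static_cast<unsigned int>(header->length);
    buffer[0] = static_cast<unsigned char>(header->type);
    buffer[1] = static_cast<unsigned char>((version >> 8) & 0xFF);
    buffer[2] = static_cast<unsigned char>(version & 0xFF);
    buffer[3] = static_cast<unsigned char>((length >> 8) & 0xFF);
    buffer[4] = static_cast<unsigned char>(length & 0xFF);
}

// Hypothetical helper: decode the question's sample bytes, encode them
// again, and report whether the bytes survive the round trip.
bool roundTripMatches() {
    const unsigned char in[5] = {10, 3, 3, 0, 58};
    Header h{};
    convertToHeader(in, &h);
    unsigned char out[5] = {};
    convertFromHeader(&h, out);
    for (std::size_t i = 0; i < 5; ++i) {
        if (in[i] != out[i]) return false;
    }
    return true;
}
```

With the question's bytes, the decoded fields are type 10, version 771 (3 * 256 + 3) and length 58, regardless of the host's byte order or struct padding.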
EDIT
A quick summary of the linked answer: other possible solutions (memcpy or union, for example) are not portable because endianness differs between systems (what you are doing is probably some form of communication between at least two heterogeneous systems). On some systems byte[0] is the LSB of the int and byte[1] is the MSB; on others it is the reverse.
Also, due to alignment, struct Header can be bigger than 5 bytes (probably 6 bytes in your case, if alignment is 2 bytes!).
Finally, because of alignment restrictions and aliasing rules, on some platforms the compiler can generate code that does not behave as you expect.
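To see how much the in-memory layout differs from the 5-byte wire layout on a given compiler, a small sketch like the following can help. The helper names paddingBytes and versionOffset are made up for this example, and the concrete numbers depend on the platform (with a 4-byte int, a size of 12 and an offset of 4 are common, but not guaranteed).

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t

struct Header {
    char type;
    int version;
    int length;
};

// Hypothetical helper: bytes of padding the compiler added, computed as
// the struct size minus the sum of the member sizes.
constexpr std::size_t paddingBytes() {
    return sizeof(Header) - (sizeof(char) + 2 * sizeof(int));
}

// Hypothetical helper: where version actually starts inside the struct.
// On the wire it starts at byte 1; in memory it is usually further along.
std::size_t versionOffset() {
    return offsetof(Header, version);
}
```

If paddingBytes() is non-zero, or sizeof(Header) is not 5, a raw memcpy from the buffer cannot put the bytes where the members live.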
What you want would require version and length to have the same size as 2 elements of your buf array; that is, you'd need to use the type uint16_t, defined in <cstdint>, rather than int, which is likely longer. You'd also need to make buf an array of uint8_t, as a char is allowed to be wider than 8 bits (sizeof(char) is always 1, but CHAR_BIT may be greater than 8)!
You probably also need to move type to the end, as otherwise the compiler will almost certainly insert a padding byte after it to be able to align version to a 2-byte boundary (once you have made it uint16_t and thus 2 bytes); your buf[1] would then end up there rather than where you want it. This is probably what you observe right now, by the way: by having a char followed by an int, which is probably 4 bytes, you have 3 bytes of padding, and elements 1 to 3 of your array are being copied there (= lost forever).
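The effect of member order on padding can be checked directly. The sketch below compares two hypothetical layouts, HeaderTypeFirst and HeaderTypeLast (names invented here), built from fixed-width types. It assumes the platform provides uint8_t and uint16_t, and note that with type last the bytes in the buffer would have to arrive in that order too.

```cpp
#include <cassert>
#include <cstddef>  // offsetof
#include <cstdint>

// type first: the compiler usually pads after type so that version
// starts on a 2-byte boundary.
struct HeaderTypeFirst {
    std::uint8_t type;
    std::uint16_t version;
    std::uint16_t length;
};

// type last: the two 16-bit fields sit back to back at the start; any
// padding typically moves to the end of the struct.
struct HeaderTypeLast {
    std::uint16_t version;
    std::uint16_t length;
    std::uint8_t type;
};

// Hypothetical helpers reporting where version lands in each layout.
std::size_t versionOffsetTypeFirst() { return offsetof(HeaderTypeFirst, version); }
std::size_t versionOffsetTypeLast() { return offsetof(HeaderTypeLast, version); }
```

On common compilers versionOffsetTypeFirst() is 2 rather than 1, which is exactly the one-byte shift described above; either struct is still typically 6 bytes, not 5, because of padding.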
Another solution would be to make your buf array longer and give it empty padding bytes as well, so that the data actually lines up with the struct fields.
It is worth mentioning again that, as pointed out in the comments, sizeof(head) returns the size of a pointer on your system, not the size of the Header structure. You can write sizeof(Header) directly; but at this level of micromanagement, you won't lose any more flexibility if you just write "5", really.
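The difference between the two sizeof expressions is easy to demonstrate. This is a minimal sketch with two hypothetical helpers, pointerSize and structSize; sizeof does not evaluate its operand, so the null pointer is never dereferenced.

```cpp
#include <cassert>
#include <cstddef>

struct Header {
    char type;
    int version;
    int length;
};

// sizeof applied to the pointer variable: the size of a Header*,
// which is what memcpy(head, buf, sizeof(head)) copies.
std::size_t pointerSize() {
    Header* head = nullptr;
    return sizeof(head);
}

// sizeof applied to the pointed-to object: the size of the struct.
std::size_t structSize() {
    Header* head = nullptr;
    return sizeof(*head);  // unevaluated operand, no dereference happens
}
```

On a typical 64-bit system the first is 8 and the second is 12, and neither is the 5 bytes the buffer actually holds.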
Also, endianness can bite you. Processors have no obligation to store the bytes of a number in the order you expect rather than the opposite one; both make internal sense, after all. This means that blindly copying the bytes buf[0], buf[1] into a number can result in (buf[0]<<8)+buf[1], but also in (buf[1]<<8)+buf[0], or even in (buf[1]<<24)+(buf[0]<<16) if the data type is 4 bytes (as int usually is). And even if it works on your computer now, there is at least one out there where the same code will produce garbage. Unless, that is, those bytes actually come from reinterpreting a number in the first place; in that case, however, it is the code you have now that is wrong (not portable).
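The two behaviours can be put side by side in a short sketch. readBigEndian16 and readByMemcpy16 are hypothetical names; the example assumes 8-bit bytes and that the platform provides uint16_t.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Arithmetic on byte values: the result is the same on every host,
// because shifts operate on values, not on the memory representation.
std::uint16_t readBigEndian16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Blind copy of the two bytes into the integer's storage: the result
// depends on the byte order of the machine running the code.
std::uint16_t readByMemcpy16(const unsigned char* p) {
    std::uint16_t value = 0;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

For the question's last two bytes, {0, 58}, readBigEndian16 always returns 58, while readByMemcpy16 returns 58 on a big-endian host and 14848 (58 * 256) on a little-endian one.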
...is it worth it?
All things considered, my strong advice is to keep handling them the way you do now. Maybe simplify it.
It really makes no sense to convert a byte to an int and then back to a byte, or to take the address of a byte only to dereference it again. Nor is there any need for helper variables with no descriptive name and no purpose other than being returned, or for a variable whose value you know in advance at all times.
Just do
int getTwoBytes(unsigned char* buf)
{
return (buf[0]<<8)+buf[1];
}
int main()
{
Header *head = new Header;
head->type = buf[0];
head->version = getTwoBytes(buf + 1);
head->length = getTwoBytes(buf + 3);
}
The better way is to create some sort of serialization/deserialization routines.
Also, I'd use not just int or char types but more specific ones such as int32_t; that is simply the platform-independent way (well, actually you can also pack your data structures with pragma pack).
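#include <cstdint>  // int32_t
#include <cstring>  // memcpy
#include <memory>   // std::shared_ptr
#include <vector>   // std::vector
struct Header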
{
char16_t type;
int32_t version;
int32_t length;
};
struct Tools
{
std::shared_ptr<Header> deserializeHeader(const std::vector<unsigned char> &loadedBuffer)
{
std::shared_ptr<Header> header(new Header);
memcpy(&(*header), &loadedBuffer[0], sizeof(Header));
return header;
}
std::vector<unsigned char> serializeHeader(const Header &header)
{
std::vector<unsigned char> buffer;
buffer.resize(sizeof(Header));
memcpy(&buffer[0], &header, sizeof(Header));
return buffer;
}
}
tools;
Header header = {'B', 5834, 4665};
auto v1 = tools.serializeHeader(header);
auto v2 = tools.deserializeHeader(v1);