
Convert char buffer to struct

I have a char buffer buf containing buf[0] = 10, buf[1] = 3, buf[2] = 3, buf[3] = 0, buf[4] = 58,

and a structure:

typedef struct
{ 
    char type;
    int version;
    int length;
}Header;

I want to convert buf into a Header. Right now I am using these functions:

int getByte( unsigned char* buf)
{
    int number = buf[0]; 
    return number;
}

int getInt(unsigned char* buf)
{
    int number =  (buf[0]<<8)+buf[1];
    return number;
}

int main()
{
    Header *head = new Header;
    int location = 0;

    head->type = getByte(&buf[location]);
    location++;     // location = 1

    head->version = getInt(&buf[location]);
    location += 2;  // location = 3

    head->length = getInt(&buf[location]);
    location += 2;  // location = 5 
}

I am searching for a solution such as

 Header *head = new Header;

 memcpy(head, buf, sizeof(head));

With this, the first value in the Header, head->type, is correct and the rest is garbage. Is it possible to convert unsigned char* buf to Header?

The only fully portable and safe way is:

void convertToHeader(unsigned char const * const buffer, Header *header)
{
    header->type = buffer[0];
    header->version = (buffer[1] <<  8) | buffer[2];
    header->length = (buffer[3] <<  8) | buffer[4];
}

and

void convertFromHeader(Header const * const header, unsigned char * buffer)
{
    buffer[0] = header->type;
    buffer[1] = (static_cast<unsigned int>(header->version) >>  8) & 0xFF;
    buffer[2] = header->version & 0xFF;
    buffer[3] = (static_cast<unsigned int>(header->length) >>  8) & 0xFF;
    buffer[4] = header->length & 0xFF;
}

Example

See "Converting bytes array to integer" for explanations.
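As a quick check of the two functions above, here is a minimal sketch that decodes the question's sample bytes and encodes them back. The helper roundTripMatches is hypothetical (it is not part of the original answer), and the sketch assumes the wire format is one type byte followed by two big-endian 16-bit fields, as the shifts above imply.

```cpp
#include <cassert>
#include <cstring>

// Same layout as the Header in the question.
struct Header {
    char type;
    int version;
    int length;
};

// Decode 5 wire bytes: 1 type byte, then two big-endian 16-bit fields.
void convertToHeader(const unsigned char* buffer, Header* header) {
    header->type = static_cast<char>(buffer[0]);
    header->version = (buffer[1] << 8) | buffer[2];
    header->length = (buffer[3] << 8) | buffer[4];
}

// Encode a Header back into the same 5-byte wire format.
void convertFromHeader(const Header* header, unsigned char* buffer) {
    const unsigned int version = static_cast<unsigned int>(header->version);
    const unsigned int length = static_cast<unsigned int>(header->length);
    buffer[0] = static_cast<unsigned char>(header->type);
    buffer[1] = static_cast<unsigned char>((version >> 8) & 0xFF);
    buffer[2] = static_cast<unsigned char>(version & 0xFF);
    buffer[3] = static_cast<unsigned char>((length >> 8) & 0xFF);
    buffer[4] = static_cast<unsigned char>(length & 0xFF);
}

// Hypothetical helper: decode, re-encode, and compare with the input bytes.
bool roundTripMatches(const unsigned char* wire) {
    Header header{};
    convertToHeader(wire, &header);
    unsigned char encoded[5] = {};
    convertFromHeader(&header, encoded);
    return std::memcmp(wire, encoded, 5) == 0;
}
```

With the buffer from the question, {10, 3, 3, 0, 58}, this decodes to type 10, version 771 (3 * 256 + 3) and length 58, regardless of the host's byte order or struct padding.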

EDIT

A quick summary of the previous link: other possible solutions (memcpy or a union, for example) are not portable because systems differ in endianness (what you are doing is probably some kind of communication between at least two heterogeneous systems): on some systems byte[0] is the LSB of the int and byte[1] is the MSB, and on others it is the reverse.

Also, due to alignment, struct Header can be bigger than 5 bytes (probably 6 bytes in your case, if alignment is 2 bytes!) (see here for example).

Finally, depending on the alignment restrictions and aliasing rules of some platforms, the compiler can generate incorrect code.
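The padding point can be inspected directly on a given compiler. The following is only a sketch: the helper names paddingAfterType and matchesWireSize are made up for illustration, and the actual numbers depend on the platform's sizeof(int) and alignment rules.

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t

// Same layout as the Header in the question.
struct Header {
    char type;
    int version;
    int length;
};

// Bytes of padding the compiler placed between `type` and `version`.
constexpr std::size_t paddingAfterType() {
    return offsetof(Header, version) - sizeof(char);
}

// True only if the struct is exactly as large as the 5-byte buffer,
// which is what a raw memcpy of the whole struct would require.
constexpr bool matchesWireSize() {
    return sizeof(Header) == 5;
}
```

On a typical platform with a 4-byte int, paddingAfterType() would be 3 and matchesWireSize() would be false, which is why a plain memcpy of the 5 bytes cannot line up with the fields.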

What you want would need your version and length to have the same size as 2 elements of your buf array; that is, you'd need to use the type uint16_t, defined in <cstdint>, rather than int, which is likely longer. You'd also need to make buf an array of uint8_t, since char is allowed to be wider than 8 bits (sizeof(char) is always 1, but a byte is not required to be an octet).

You probably also need to move type to the end, as otherwise the compiler will almost certainly insert a padding byte after it to be able to align version to a 2-byte boundary (once you have made it uint16_t and thus 2 bytes), and then your buf[1] would end up there rather than where you want it. This is probably what you observe right now, by the way: by having a char followed by an int, which is probably 4 bytes, you have 3 bytes of padding, and elements 1 to 3 of your array are being copied into it (= lost forever).

Another solution would be to make your buf array longer and give it empty padding bytes as well, so that the data is actually aligned with the struct fields.

Worth mentioning again is that, as pointed out in the comments, sizeof(head) returns the size of a pointer on your system, not of the Header structure. You can directly write sizeof(Header); but at this level of micromanagement, you won't be losing any more flexibility if you just write "5", really.
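To make the sizeof point concrete, here is a small sketch. The function names are hypothetical; they only report the two sizes so they can be compared on your own system.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

// Same layout as the Header in the question.
struct Header {
    char type;
    int version;
    int length;
};

// What memcpy(head, buf, sizeof(head)) actually uses as its byte count:
// `head` is a Header*, so this is the size of a pointer.
std::size_t bytesCopiedByOriginalCall() {
    Header* head = nullptr;
    return sizeof(head);  // sizeof does not evaluate or dereference head
}

// What was presumably intended: the size of the structure itself.
std::size_t sizeOfStructure() {
    return sizeof(Header);
}
```

Neither value is guaranteed to be 5, which is the number of bytes the buffer actually holds.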

Also, endianness can bite you. Processors have no obligation to store the bytes of a number in the order you expect rather than the opposite one; both make internal sense, after all. This means that blindly copying bytes buf[0], buf[1] into a number can result in (buf[0]<<8)+buf[1], but also in (buf[1]<<8)+buf[0], or even in (buf[1]<<24)+(buf[0]<<16) if the data type is 4 bytes (as int usually is). And even if it works on your computer now, there is at least one out there where the same code will result in garbage. Unless, that is, those bytes actually come from reinterpreting a number in the first place. In that case, however, it is the code you have now that is wrong (not portable).
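The difference between an explicit decode and a blind copy can be sketched as follows. All three function names are illustrative, not from the original post; hostIsLittleEndian is a simple runtime probe and only distinguishes little-endian hosts from everything else.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Explicit big-endian decode: gives the same value on every host.
std::uint16_t decodeBigEndian16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Blind copy: the resulting value depends on the host's byte order.
std::uint16_t copyRaw16(const unsigned char* p) {
    std::uint16_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Runtime probe: is the least significant byte stored first?
bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

For the bytes {0, 58}, decodeBigEndian16 always yields 58, while copyRaw16 yields 58 on a big-endian host and 14848 (58 << 8) on a little-endian one.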

...is it worth it?

All things considered, my strong advice is to keep handling them the way you do now. Maybe simplify it.

It really makes no sense to convert a byte to an int and then to a byte again, or to take the address of a byte only to dereference it again; nor is there any need for helper variables with no descriptive name and no purpose other than being returned, or for a variable whose value you know in advance at all times.

Just do

int getTwoBytes(unsigned char* buf)
{
    return (buf[0]<<8)+buf[1];
}

int main()
{
    Header *head = new Header;

    head->type = buf[0];

    head->version = getTwoBytes(buf + 1);

    head->length = getTwoBytes(buf + 3);
}

The better way is to create some sort of serialization/deserialization routines.

Also, I would not use plain int or char types, but more specific ones such as int32_t. That is simply the platform-independent way (well, actually you can also pack your data structures with pragma pack).

    #include <cstdint>
    #include <cstring>
    #include <memory>
    #include <vector>

    struct Header
    {
        char16_t type;
        int32_t version;
        int32_t length;
    };
    struct Tools
    {
        std::shared_ptr<Header> deserializeHeader(const std::vector<unsigned char> &loadedBuffer)
        {
            std::shared_ptr<Header> header(new Header);
            memcpy(&(*header), &loadedBuffer[0], sizeof(Header));
            return header;
        }
        std::vector<unsigned char> serializeHeader(const Header &header)
        {
            std::vector<unsigned char> buffer;
            buffer.resize(sizeof(Header));
            memcpy(&buffer[0], &header, sizeof(Header));
            return buffer;
        }
    }
    tools;
    Header header = {'B', 5834, 4665};
    auto v1 = tools.serializeHeader(header);
    auto v2 = tools.deserializeHeader(v1);
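The pragma pack remark could look roughly like the sketch below. It assumes a compiler that supports #pragma pack(push, 1) (GCC, Clang and MSVC do, but it is not standard C++), and it uses a hypothetical PackedHeader with 16-bit fields to match the 5-byte buffer from the question rather than the int32_t fields above. Packing only removes padding; the multi-byte fields still arrive in wire byte order, so the illustrative helper fromBigEndian16 is needed to read them portably.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compiler extension: lay out the struct with no padding between fields.
#pragma pack(push, 1)
struct PackedHeader {
    std::uint8_t type;
    std::uint16_t version;
    std::uint16_t length;
};
#pragma pack(pop)

static_assert(sizeof(PackedHeader) == 5, "packing is not in effect");

// Copies the 5 raw bytes; version and length keep the wire byte order.
PackedHeader loadPacked(const unsigned char* buffer) {
    PackedHeader header;
    std::memcpy(&header, buffer, sizeof header);
    return header;
}

// Reinterpret a raw 16-bit field as big-endian, whatever the host order is.
std::uint16_t fromBigEndian16(std::uint16_t raw) {
    unsigned char bytes[2];
    std::memcpy(bytes, &raw, sizeof bytes);
    return static_cast<std::uint16_t>((bytes[0] << 8) | bytes[1]);
}
```

Loading {10, 3, 3, 0, 58} this way and passing version and length through fromBigEndian16 gives 771 and 58. Whether this is preferable to decoding the bytes field by field is a judgment call, since it trades portability for brevity.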
