How to securely read data from a packet in C?

How can I read data from a packet in C and convert it into a structure? I mean, there's a structure like

|=======================================================================
|0123456701234567012345670123456701234567012345670123456701234567.......
|  type  |             length            |    MSG HDR    |    data

into a struct like

struct msg {
  char type;
  size_t length;
  int hdr;
  struct data * data;
};

Is the following code fine?

bool parse_packet(char * packet, size_t packet_len, struct msg * result) {
    if(packet_len < 5) return false;
    result->type = *packet++;
    result->length = ntohl(*(int*)packet);
    packet+=4;
    if(result->length + 4 + 5 > packet_len)
      return false;
    if(result->length < 2)
      return false;
    result->hdr = ntohs(*(short*)packet);
    packet+=2;
    return parse_data(result, packet);
}

It's usually good practice to check that packet and result are non-null.

Why are you checking that packet_len < 5 when the header is 7 bytes? Why not just ensure that the packet is at least 7 bytes and get it over with? Or is hdr not present for some type?

I'm not sure what you're trying to achieve with

if(result->length + 4 + 5 > packet_len)
    result->hdr = ntohs(*(short*)packet);
packet+=2;

If the declared message length plus nine is greater than the received message length, you read another two bytes from the message. Then, regardless of the length of the data, you add two to the pointer and try to parse something out of it. What if packet_len is 5 and result->length is 4294967295? You're going to read off the end of your buffer, just like in Heartbleed. You need to always verify that your reads are in bounds, and never trust the size declared in the packet.
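One way to keep such a check robust against a huge declared length is to compare it with the number of bytes actually remaining, so that no arithmetic is done on the untrusted value. The sketch below assumes the 7-byte header from the question (1 byte type, 4 bytes length, 2 bytes hdr); `payload_fits` and `HEADER_LEN` are names made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed header size: 1 (type) + 4 (length) + 2 (hdr). */
#define HEADER_LEN 7u

/* Hypothetical helper: returns true only if a payload of declared_len
 * bytes fits in a packet of packet_len bytes after the fixed header. */
static bool payload_fits(size_t packet_len, uint32_t declared_len)
{
    if (packet_len < HEADER_LEN) {
        return false;   /* not even a complete header */
    }
    /* packet_len - HEADER_LEN cannot underflow after the check above,
     * and declared_len is only compared, never added to, so nothing wraps. */
    return declared_len <= packet_len - HEADER_LEN;
}
```

With packet_len of 5 and a declared length of 4294967295, this returns false before any payload byte is touched.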

You have a completely standard situation. There's nothing deep or surprising here.

Start with a specification of the wire format. You can use pseudo-code or actual C types for that, but the implication is that the data is packed into bytes on the wire:

struct Message  // wire format, pseudo code
{
    uint8_t    type;
    uint32_t   length;      // big-endian on the wire
    uint8_t    header[2];
    uint8_t    data[length];
};

Now start parsing:

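#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>   // ntohl (POSIX)

// in-memory result type (an assumption made here so the code compiles;
// unlike the wire-format pseudo code, "data" is a pointer)
typedef struct Message
{
    uint8_t         type;
    uint32_t        length;
    uint8_t         header[2];
    unsigned char * data;
} Message;

// parses a Message from (buf, size)
// precondition: "buf" points to "size" bytes of data; "msg" points to Message
// returns true on success
// msg->data is malloc()ed and contains the data on success
bool parse_message(const unsigned char * buf, size_t size, Message * msg)
{
    if (size < 7) { return false; }

    // parse length (memcpy avoids a misaligned read)
    uint32_t n;
    memcpy(&n, buf + 1, 4);
    n = ntohl(n);            // convert big-endian (wire) to native

    if (n > SIZE_MAX - 7)
    {
        // this is an implementation limit!
        return false;
    }

    // cast so the sum is computed in size_t and cannot wrap in uint32_t
    if (size != 7 + (size_t)n) { return false; }

    // copy data; request at least one byte, because malloc(0) may
    // return NULL and an empty message would then look like a failure
    unsigned char * p = malloc(n ? n : 1);
    if (!p) { return false; }
    memcpy(p, buf + 7, n);

    // populate result
    msg->type = buf[0];
    msg->length = n;
    msg->header[0] = buf[5];
    msg->header[1] = buf[6];
    msg->data = p;

    return true;
}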

An alternative way of parsing the length is to assemble it directly from the bytes:

uint32_t n = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 8) | (uint32_t)buf[4];
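The byte-wise approach generalizes to small helpers, sketched below. They also sidestep casts like the `*(int*)packet` in the question, which may perform a misaligned read and can run afoul of the strict-aliasing rules. The names `read_be16` and `read_be32` are made up for this example; they are not standard functions.

```c
#include <stdint.h>

/* Hypothetical helpers: read big-endian integers one byte at a time.
 * The caller must already have checked that enough bytes are available. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t read_be32(const unsigned char *p)
{
    /* Cast before shifting so the shift happens in an unsigned 32-bit type. */
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

For the layout in the question, the length would then be read as read_be32(buf + 1) and the 2-byte header as read_be16(buf + 5), once the packet is known to hold at least 7 bytes.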

This code assumes that buf contains exactly one message. If you're taking messages off of a stream, you need to modify the code (namely the if (size != 7 + n) check) to test whether at least as much data is available as required, and to return the amount of consumed data, too, so the caller can advance their stream position accordingly. (The caller could in this case compute the amount of data that was parsed as msg->length + 7, but relying on that is not scalable.)
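A streaming variant along those lines might look like the sketch below. It is an illustration under assumptions, not the code from this answer: `peek_message`, `struct msg_view`, and `enum parse_status` are invented names, and the payload is exposed as a pointer into the caller's buffer instead of a malloc()ed copy.

```c
#include <stddef.h>
#include <stdint.h>

enum parse_status { PARSE_OK, PARSE_NEED_MORE };

struct msg_view {
    uint8_t        type;
    uint32_t       length;
    uint8_t        header[2];
    const uint8_t *data;      /* points into the caller's buffer */
};

/* Tries to parse one message from the front of (buf, size).
 * On PARSE_OK, *consumed is the number of bytes the message occupied.
 * On PARSE_NEED_MORE, *out and *consumed are left untouched. */
static enum parse_status peek_message(const uint8_t *buf, size_t size,
                                      struct msg_view *out, size_t *consumed)
{
    if (size < 7) {
        return PARSE_NEED_MORE;            /* header incomplete */
    }

    uint32_t n = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
                 ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];

    if (n > size - 7) {
        return PARSE_NEED_MORE;            /* payload incomplete */
    }

    out->type      = buf[0];
    out->length    = n;
    out->header[0] = buf[5];
    out->header[1] = buf[6];
    out->data      = buf + 7;

    *consumed = 7 + (size_t)n;             /* n <= size - 7, so no overflow */
    return PARSE_OK;
}
```

The caller advances its stream position by *consumed after each PARSE_OK and calls again. A real receiver would probably also cap n at some protocol-specific maximum, since otherwise a peer can announce a length close to 4 GiB and keep the receiver buffering indefinitely.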

Note: As @user points out, if your size_t is not wider than uint32_t, then this implementation will reject very large messages even though they are valid on the wire. Specifically, any message for which 7 + n > n does not hold (that is, 7 + n overflows size_t) is rejected. I included a dynamic check for this (unlikely) condition.
