How can I read data from a packet in C and convert it into a structure? I mean, there's a structure like
|=======================================================================
|0123456701234567012345670123456701234567012345670123456701234567.......
| type | length | MSG HDR | data
into a struct like
struct msg {
char type;
size_t length;
int hdr;
struct data * data;
};
Is the following code fine?
bool parse_packet(char * packet, size_t packet_len, struct msg * result) {
if(packet_len < 5) return false;
result->type = *packet++;
result->length = ntohl(*(int*)packet);
packet+=4;
if(result->length + 4 + 5 > packet_len)
return false;
if(result->length < 2)
return false;
result->hdr = ntohs(*(short*)packet);
packet+=2;
return parse_data(result, packet);
}
It's usually good practice to check that packet and result are non-null.
Why are you checking that packet_len < 5 when the header is 7 bytes? Why not just ensure that the packet is at least 7 bytes and get it over with? Or is hdr not present for some type?
I'm not sure what you're trying to achieve with
if(result->length + 4 + 5 > packet_len)
result->hdr = ntohs(*(short*)packet);
packet+=2;
If the declared message length plus nine is greater than the received message length, you read another two bytes from the message. Then regardless of the length of the data, you add two to the pointer and try to parse something out of it. What if packet_len is 5 and result->length is 4294967295? You're going to read off the end of your buffer, just like in Heartbleed. You need to always verify that your reads are in bounds, and never trust the size declared in the packet.
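One way to make that bounds check robust against wrap-around is to subtract from the trusted value (the number of bytes actually received) rather than add to the untrusted one. The following is only a sketch: payload_fits and HEADER_SIZE are hypothetical names, and it assumes the declared length counts only the bytes that follow the 7-byte header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed header size: 1 byte type + 4 bytes length + 2 bytes hdr. */
enum { HEADER_SIZE = 7 };

/*
 * Returns true only if a payload of declared_len bytes fits inside a packet
 * of packet_len bytes after the header. The subtraction is done on
 * packet_len, which we control, so an attacker-chosen declared_len cannot
 * make the arithmetic wrap around.
 */
static bool payload_fits(size_t packet_len, uint32_t declared_len)
{
    if (packet_len < HEADER_SIZE) {
        return false;   /* not even a complete header */
    }
    return declared_len <= packet_len - HEADER_SIZE;
}
```

With this helper, a 5-byte packet that claims a length of 4294967295 is rejected before any payload byte is touched.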
You have a completely standard situation. There's nothing deep or surprising here.
Start with a specification of the wire format. You can use pseudo-code or actual C types for that, but the implication is that the data is packed into bytes on the wire:
struct Message // wire format, pseudo code
{
uint8_t type;
uint32_t length; // big-endian on the wire
uint8_t header[2];
uint8_t data[length];
};
Now start parsing:
// parses a Message from (buf, size)
// precondition: "buf" points to "size" bytes of data; "msg" points to Message
// returns true on success
// msg->data is malloc()ed and contains the data on success
bool parse_message(const unsigned char * buf, size_t size, struct Message * msg)
{
if (size < 7) { return false; }
// parse length
uint32_t n;
memcpy(&n, buf + 1, 4);
n = ntohl(n); // convert big-endian (wire) to native
if (n > SIZE_MAX - 7)
{
// this is an implementation limit!
return false;
}
if (size != 7 + n) { return false; }
// copy data
unsigned char * p = malloc(n);
if (!p) { return false; }
memcpy(p, buf + 7, n);
// populate result
msg->type = buf[0];
msg->length = n;
msg->header[0] = buf[5];
msg->header[1] = buf[6];
msg->data = p;
return true;
}
An alternative way of parsing the length is like this, directly:
uint32_t n = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 8) | (uint32_t)buf[4];
This code assumes that buf contains exactly one message. If you're taking messages off of a stream, you need to modify the code (namely the check if (size != 7 + n)) to test whether at least as much data is available as required, and return the amount of consumed data, too, so the caller can advance their stream position accordingly. (The caller could in this case compute the amount of data that was parsed as msg->length + 7, but relying on that is not scalable.)
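A stream-oriented variant might look roughly like the sketch below. The names parsed_msg, parse_status and parse_from_stream are made up for illustration, and the 7-byte header layout is the one assumed above. It reports "need more data" instead of failing when the buffer holds only part of a message, and hands the number of consumed bytes back through an out-parameter.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of a parsed message. */
struct parsed_msg {
    uint8_t type;
    uint32_t length;       /* number of payload bytes */
    uint8_t header[2];
    unsigned char *data;   /* malloc()ed copy of the payload, NULL if empty */
};

enum parse_status { PARSE_OK, PARSE_NEED_MORE, PARSE_ERROR };

/* Tries to parse one message from the first "avail" bytes of "buf".
 * On PARSE_OK, *consumed is the number of bytes the message occupied. */
static enum parse_status parse_from_stream(const unsigned char *buf, size_t avail,
                                           struct parsed_msg *msg, size_t *consumed)
{
    if (avail < 7) {
        return PARSE_NEED_MORE;            /* header not complete yet */
    }

    /* big-endian length, assembled byte by byte */
    uint32_t n = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16)
               | ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];

    if (avail - 7 < n) {
        return PARSE_NEED_MORE;            /* payload not complete yet */
    }

    unsigned char *p = NULL;
    if (n > 0) {
        p = malloc(n);
        if (!p) {
            return PARSE_ERROR;
        }
        memcpy(p, buf + 7, n);
    }

    msg->type = buf[0];
    msg->length = n;
    msg->header[0] = buf[5];
    msg->header[1] = buf[6];
    msg->data = p;
    *consumed = 7 + (size_t)n;             /* cannot overflow: n <= avail - 7 */
    return PARSE_OK;
}
```

The caller would loop: on PARSE_OK, advance the read position by *consumed and free msg->data when done; on PARSE_NEED_MORE, read more bytes and retry. A real implementation would probably also cap n at some protocol-defined maximum so that a peer cannot force the receiver to buffer gigabytes while waiting for a message to complete.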
Note: As @user points out, if your size_t is not wider than uint32_t, then this implementation will wrongly reject very large messages. Specifically, messages for which it is not true that 7 + n > n will be rejected. I included a dynamic check for this (unlikely) condition.