
Portable way to find size of a packed structure in C

I'm coding a network-layer protocol and I need to find the size of a packed structure defined in C. Since compilers may add extra padding bytes, sizeof is useless in my case. I searched Google and found that __attribute__((packed)) can be used to prevent the compiler from adding extra padding bytes. But I believe this is not a portable approach; my code needs to support both Windows and Linux environments.

Currently, I've defined a macro that maps every structure defined in my code to its packed size. Consider the code below:

typedef struct {
...
} a_t;

typedef struct {
...
} b_t;

#define SIZE_a_t 8
#define SIZE_b_t 10

#define SIZEOF(XX) SIZE_##XX

and then in the main function, I can use the macro definition above:

int size = SIZEOF(a_t);

This approach works, but I believe it may not be the best one. Any suggestions or ideas on how to solve this problem efficiently in C?

Example

Consider the C structure below:

typedef struct {
   uint8_t  a;
   uint16_t b;
} e_t;

Under Linux, sizeof returns 4 bytes instead of 3. To prevent this, I'm currently doing the following:

typedef struct {
   uint8_t  a;
   uint16_t b;
} e_t;

#define SIZE_e_t 3
#define SIZEOF(XX) SIZE_##XX

Now, when I call SIZEOF(e_t) in my function, it returns 3 instead of 4.

sizeof is the portable way to find the size of a struct, or of any other C data type.

The problem you're facing is how to ensure that your struct has the size and layout that you need.

#pragma pack or __attribute__((packed)) may well do the job for you. It's not 100% portable (there's no mention of packing in the C standard), but it may be portable enough for your current purposes; consider, though, whether your code might need to be ported to some other platform in the future. It's also potentially unsafe; see this question and this answer.

The only 100% portable approach is to use arrays of unsigned char and keep track of which fields occupy which ranges of bytes. This is a lot more cumbersome, of course.
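As a sketch of that approach, using the e_t structure from the question: the wire layout below (byte 0 holds a, bytes 1-2 hold b big-endian) is an assumption for illustration, as is every identifier other than e_t itself.

```c
#include <stdint.h>

typedef struct {
   uint8_t  a;
   uint16_t b;
} e_t;

/* Assumed 3-byte wire format: byte 0 = a, bytes 1-2 = b, big-endian. */
#define E_T_WIRE_SIZE 3

void e_t_pack(unsigned char buf[E_T_WIRE_SIZE], const e_t *v)
{
   buf[0] = v->a;
   buf[1] = (unsigned char)(v->b >> 8);
   buf[2] = (unsigned char)(v->b & 0xFF);
}

void e_t_unpack(e_t *v, const unsigned char buf[E_T_WIRE_SIZE])
{
   v->a = buf[0];
   v->b = (uint16_t)((buf[1] << 8) | buf[2]);
}
```

Note that this also pins down the byte order, which a packed struct alone does not.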

Your macro tells you the size that you think the struct should have, if it has been laid out as you intend.

If that's not equal to sizeof(a_t), then whatever code you write that thinks it is packed isn't going to work anyway. Assuming they're equal, you might as well just use sizeof(a_t) for all purposes. If they're not equal, then you should be using SIZEOF(a_t) only for some kind of check that SIZEOF(a_t) == sizeof(a_t), which will fail and prevent your non-working code from compiling.

So it follows that you might as well just put the check in the header file that sizeof(a_t) == 8 , and not bother defining SIZEOF .
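With C11 that check can be made a compile-time failure. A minimal sketch, assuming a GCC/Clang-style packed attribute on the e_t structure from the question:

```c
#include <stdint.h>

/* Assumes a compiler that honors __attribute__((packed)), e.g. GCC/Clang. */
typedef struct __attribute__((packed)) {
   uint8_t  a;
   uint16_t b;
} e_t;

/* C11: the build breaks here if the layout is not what the code expects. */
_Static_assert(sizeof(e_t) == 3, "e_t is not packed to 3 bytes");
```

On pre-C11 compilers, the same effect is traditionally achieved with a negative-array-size typedef trick.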

That's all aside from the fact that SIZEOF doesn't really behave like sizeof. For example, consider typedef a_t foo; sizeof(foo);, which obviously won't work with SIZEOF.

I don't think that specifying the size manually is more portable than using sizeof.

If the structure changes, your hand-specified constant will be wrong.

The packed attribute is widely supported; in Visual Studio the equivalent is #pragma pack.
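For example, #pragma pack(push, 1) is accepted by MSVC, GCC, and Clang, which covers the Windows + Linux case from the question. A sketch (the C11 _Static_assert makes the build fail rather than misbehave if the pragma is not honored):

```c
#include <stdint.h>

/* #pragma pack(push, 1) / pack(pop) works on MSVC, GCC, and Clang. */
#pragma pack(push, 1)
typedef struct {
   uint8_t  a;
   uint16_t b;
} e_t;
#pragma pack(pop)

_Static_assert(sizeof(e_t) == 3, "packing pragma was not honored");
```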

I would recommend against trying to read/write data by overlaying it on a struct. I would suggest instead writing a family of routines which are conceptually like printf/scanf, but which use format specifiers that specify binary data formats. Rather than using percent-sign-based tags, I would suggest simply using a binary encoding of the data format.

There are a few approaches one could take, involving trade-offs between the size of the serialization/deserialization routines themselves, the size of the code necessary to use them, and the ability to handle a variety of deserialization formats. The simplest (and most easily portable) approach would be to have routines which, instead of using a format string, process items individually: each takes a double-indirect pointer, reads some data type from it, and increments it suitably. Thus:

uint32_t read_uint32_bigendian(uint8_t const ** src)
{
  uint8_t const *p;
  uint32_t tmp;

  p = *src;
  tmp  = (uint32_t)(*p++) << 24;
  tmp |= (uint32_t)(*p++) << 16;
  tmp |= (uint32_t)(*p++) << 8;
  tmp |= (uint32_t)(*p++);
  *src = p;
  return tmp;
}

...
  uint8_t buff[256];
...
  uint8_t const *buffptr = buff;
  first_word = read_uint32_bigendian(&buffptr);
  next_word = read_uint32_bigendian(&buffptr);

This approach is simple, but has the disadvantage of having lots of redundancy in the packing and unpacking code. Adding a format string could simplify it:

#define BIGEND_INT32 "\x43"  // Or whatever the appropriate token would be
  uint8_t *buffptr = buff;
  read_data(&buffptr, BIGEND_INT32 BIGEND_INT32, &first_word, &second_word);

This approach could read any number of data items with a single function call, passing buffptr only once, rather than once per data item. On some systems, it might still be a bit slow. An alternative approach would be to pass in a string indicating what sort of data should be received from the source, and then also pass in a string or structure indicating where the data should go. This could allow any amount of data to be parsed by a single call giving a double-indirect pointer for the source, a string pointer indicating the format of data at the source, a pointer to a struct indicating how the data should be unpacked, and a pointer to a struct to hold the target data.
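A minimal sketch of what such a read_data could look like, supporting just two format tokens (the token byte values here, like the \x43 above, are arbitrary illustrations, not a real API):

```c
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical format tokens; the byte values are arbitrary. */
#define TOK_U8     "\x01"   /* one unsigned byte            */
#define TOK_U32_BE "\x02"   /* 32-bit big-endian integer    */

/* Reads the items described by fmt from *src, advancing *src past them.
   One pointer argument per format token, in order. */
void read_data(uint8_t const **src, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        uint8_t const *p = *src;
        switch (*fmt) {
        case 0x01:                       /* TOK_U8 */
            *va_arg(ap, uint8_t *) = *p++;
            break;
        case 0x02:                       /* TOK_U32_BE */
            *va_arg(ap, uint32_t *) = ((uint32_t)p[0] << 24)
                                    | ((uint32_t)p[1] << 16)
                                    | ((uint32_t)p[2] << 8)
                                    |  (uint32_t)p[3];
            p += 4;
            break;
        }
        *src = p;
    }
    va_end(ap);
}
```

It would then be called as in the snippet above, e.g. read_data(&buffptr, TOK_U32_BE TOK_U32_BE, &first_word, &second_word); and a write_data counterpart would mirror it in the other direction.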
