
Allocating contiguous memory to contain multiple structs with flexible array members

Consider a struct containing a flexible array member like the following:

typedef struct {
    size_t len;
    char data[];
} Foo;

I have an unknown number of Foos, each of an unknown size; however, I can be certain that all of my Foos together will total exactly 1024 bytes. How can I allocate 1024 bytes for an array of Foos before knowing the length of each Foo, and then fill in the members of the array later?

Something like this, although it throws a segfault:

Foo *array = malloc(1024);
int array_size = 0;

Foo foo1;
strcpy(foo1.data, "bar");
array[0] = foo1;
array_size++;

Foo foo2;
strcpy(foo2.data, "bar");
array[1] = foo2;
array_size++;

for (int i = 0; i < array_size; i++)
    puts(array[i].data);

The reason for wanting to do this is to keep all the Foos in a contiguous memory region, for CPU cache friendliness.

You can't have an array of Foo at all, because Foo does not have a fixed size, and the defining characteristic of an array is that each element has a fixed size and an offset from the base that is computable from its index. For what you want to work, indexing array[n] would have to know the full sizes of array[0], array[1], ..., array[n-1], which is impossible, because the language has no knowledge of those sizes. In practice, the flexible array member is simply excluded from sizeof(Foo), so array[1] will overlap with the data of array[0].

If you need to be able to access these objects as an array, you need to give up on putting a flexible array member in each one. Instead you could put all the data at the end, and store a pointer or offset to the data in each one. If you don't need to be able to access them as an array, you could instead build a sort of linked list in the allocated memory, storing an offset to the next entry as a member of each entry. (See for example how struct dirent works with getdents on most Unices.)
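The first alternative (fixed-size entries with all the data at the end) might be sketched as follows. This is only an illustration under stated assumptions: the names Entry, Table, table_init, table_push, table_get and table_free are hypothetical and do not come from the question, and strings are stored with their terminating '\0'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size record: says where one item's bytes live. */
typedef struct {
    size_t offset; /* start of the bytes, relative to the data pool */
    size_t len;    /* number of bytes stored, including the '\0' */
} Entry;

/* Hypothetical handle for one allocation: entries first, data after. */
typedef struct {
    Entry *entries;   /* start of the allocation */
    char *pool;       /* first byte after the entry array */
    size_t count;     /* entries in use */
    size_t capacity;  /* maximum number of entries */
    size_t pool_used; /* bytes of the pool in use */
    size_t pool_size; /* total bytes in the pool */
} Table;

/* Allocate room for `capacity` entries plus `pool_size` data bytes
   in a single block. Returns 0 on success, -1 on failure. */
static int table_init(Table *t, size_t capacity, size_t pool_size)
{
    if (capacity > ((size_t)-1 - pool_size) / sizeof(Entry))
        return -1; /* the size computation would overflow */
    void *block = malloc(capacity * sizeof(Entry) + pool_size);
    if (block == NULL)
        return -1;
    t->entries = block;
    t->pool = (char *)block + capacity * sizeof(Entry);
    t->count = 0;
    t->capacity = capacity;
    t->pool_used = 0;
    t->pool_size = pool_size;
    return 0;
}

/* Copy the string s into the pool and record it in the next entry.
   Returns the new index, or -1 if either region is full. */
static int table_push(Table *t, const char *s)
{
    size_t len = strlen(s) + 1;
    if (t->count == t->capacity || len > t->pool_size - t->pool_used)
        return -1;
    memcpy(t->pool + t->pool_used, s, len);
    t->entries[t->count].offset = t->pool_used;
    t->entries[t->count].len = len;
    t->pool_used += len;
    return (int)t->count++;
}

/* Ordinary array indexing works because every Entry has the same size. */
static const char *table_get(const Table *t, size_t i)
{
    assert(i < t->count);
    return t->pool + t->entries[i].offset;
}

static void table_free(Table *t)
{
    free(t->entries); /* entries is the start of the single block */
}
```

Here t.entries[i] is a real array element, so lookup by index is constant time, while the variable-length bytes still sit in the same contiguous block.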

As others have noted, you cannot have a C array of Foo. However, suppose you are willing to store them irregularly and just need to know how much space could be required. This answer shows that.

Let N be the number of Foo objects there are.

Let S be sizeof(Foo), which is the size of a Foo object with zero bytes for data.

Let A be _Alignof(Foo).

Every Foo object must start at an address aligned to A bytes. Since S is itself a multiple of A, the worst case for padding is that the data array is one byte, requiring that A−1 bytes be skipped before the start of the next Foo.

Therefore, in addition to the 1024 bytes consumed by the Foo objects (including their data), we might need (N−1)•(A−1) bytes for this padding. (The N−1 is because no padding bytes are needed after the last Foo.)

If each Foo has at least one byte of data, the most N could be is floor(1024/(S+1)), because we know that all of the Foo objects and their data use at most 1024 bytes.

Therefore 1024 + floor(1024/(S+1)−1)*(A−1) bytes suffice: 1024 bytes for the actual data and floor(1024/(S+1)−1)*(A−1) for the padding.

Note that the above assumes each Foo has at least one byte of data. If one or more Foo have zero bytes of data, N could be larger than floor(1024/(S+1)). However, no padding is needed after any such Foo, and N cannot increase by more than one for each such Foo (because reducing the space used by one byte cannot make room for more than one additional Foo). Thus, such a Foo could give us one more Foo elsewhere that needs A−1 bytes of padding, but it does not need padding itself, so the total amount of padding needed cannot increase.
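As a worked instance of this bound, the small function below evaluates it for given values of S and A. The name worst_case_bytes is hypothetical, and the figures S = 8 and A = 8 are an assumption about a typical 64-bit platform, not something the C standard guarantees. With those values, floor(1024/9) = 113, so the padding is at most 112 × 7 = 784 bytes and the total is 1808 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Upper bound on the bytes needed to pack objects whose sizes
   (header plus data) sum to `total`, where s is sizeof(Foo) and
   a is _Alignof(Foo). Hypothetical helper name. */
static size_t worst_case_bytes(size_t total, size_t s, size_t a)
{
    assert(a > 0);
    size_t n = total / (s + 1); /* most objects with >= 1 data byte each */
    if (n == 0)
        return total;           /* nothing fits, so no padding either */
    return total + (n - 1) * (a - 1);
}
```

In real code one would pass sizeof(Foo) and _Alignof(Foo) rather than hard-coded numbers.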

So, a plan to assign memory for the Foo objects is:

  • Allocate 1024 + floor(1024/(S+1)−1)*(A−1) bytes.
  • Put the first Foo at the start of the allocated memory.
  • Put each successive Foo at the next A-aligned address after the end of the previous Foo (including its data).

This will not yield an array, of course, just a mass of Foo objects within the allocated space. You will need pointers or other means of addressing them.
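The placement steps above could be sketched like this. It is a minimal illustration, not a definitive implementation: align_up, foo_append and foo_next are hypothetical helper names, the buffer is assumed to come from malloc (so its start is suitably aligned), and len is assumed to count the data bytes including the string terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t len;
    char data[];
} Foo;

/* Round n up to the next multiple of _Alignof(Foo). */
static size_t align_up(size_t n)
{
    size_t a = _Alignof(Foo);
    return (n + a - 1) / a * a;
}

/* Place a Foo holding the string s at byte offset *off within buf,
   then advance *off to the next aligned slot. Returns the new Foo,
   or NULL if it does not fit in the cap bytes of buf. */
static Foo *foo_append(unsigned char *buf, size_t cap, size_t *off,
                       const char *s)
{
    size_t len = strlen(s) + 1;      /* data bytes, including '\0' */
    size_t need = sizeof(Foo) + len;
    assert(*off % _Alignof(Foo) == 0);
    if (*off > cap || need > cap - *off)
        return NULL;
    Foo *f = (Foo *)(buf + *off);
    f->len = len;
    memcpy(f->data, s, len);
    *off = align_up(*off + need);
    return f;
}

/* Step to the Foo packed after f. Because f is aligned and
   sizeof(Foo) is a multiple of the alignment, rounding the size
   up is the same as rounding the end address up. */
static Foo *foo_next(Foo *f)
{
    return (Foo *)((unsigned char *)f + align_up(sizeof(Foo) + f->len));
}
```

Since each Foo records its own len, the objects can be visited in order with foo_next, provided the caller tracks how many were appended or where the used region ends.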

Per C 2018 7.22.3.4 2:

The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.

So, chopping up the space returned by malloc in an irregular way to use for multiple objects is not a good fit for that specification. I will leave that for others to discuss, but I have not observed a C implementation have a problem with it.
