
Allocating contiguous memory to contain multiple structs with flexible array members

Consider a struct containing a flexible array member like the following:

typedef struct {
    size_t len;
    char data[];
} Foo;

I have an unknown number of Foos, each of unknown size, but I can be certain that all of my Foos together will total exactly 1024 bytes. How can I allocate 1024 bytes for an array of Foos before knowing the length of each Foo, and then fill in the members of the array later?

Something like this, although it segfaults:

Foo *array = malloc(1024);
int array_size = 0;

Foo foo1;
strcpy(foo1.data, "bar");
array[0] = foo1;
array_size++;

Foo foo2;
strcpy(foo2.data, "bar");
array[1] = foo2;
array_size++;

for (int i = 0; i < array_size; i++)
    puts(array[i].data);

The reason for wanting to do this is to keep all the Foos in a contiguous memory region, for CPU cache friendliness.

You can't have an array of Foos at all, because Foo does not have a fixed size, and the defining characteristic of an array is that each element has a fixed size, so its offset from the base is computable from its index. For what you want to work, indexing array[n] would have to know the full sizes of array[0], array[1], ..., array[n-1], which is impossible because the language has no knowledge of those sizes. In practice, the flexible array member is simply excluded from sizeof(Foo), so array[1] will "overlap" with array[0]'s data.
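The overlap can be checked directly. The following is a minimal sketch, not part of the original answer; the helper name data_bytes_before_next is hypothetical. It measures how many bytes of one element's data would fit before the next array element begins.

```c
#include <stddef.h>

typedef struct {
    size_t len;
    char data[];
} Foo;

/* Hypothetical helper: if Foo objects were indexed as an array, consecutive
   elements would be sizeof(Foo) bytes apart. This returns how many bytes of
   array[0].data would lie before array[1] begins. Only trailing padding, if
   any, is counted, so the result is always smaller than _Alignof(Foo). */
static size_t data_bytes_before_next(void) {
    return sizeof(Foo) - offsetof(Foo, data);
}
```

On common implementations this is expected to return 0, meaning the very first data byte of one element already occupies the storage of the next element's len.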

If you need to be able to access these objects as an array, you need to give up on putting a flexible array member in each one. Instead, you could put all the data at the end and store a pointer or offset to the data in each one. If you don't need to access them as an array, you could instead build a sort of linked list in the allocated memory, storing an offset to the next entry as a member of each entry. (See, for example, how struct dirent works with getdents on most Unices.)
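As an illustration of the first suggestion, here is a rough sketch that keeps fixed-size headers at the front of a single allocation and the variable-length data behind them. The Entry type and the pack_strings and entry_data helpers are hypothetical names chosen for this example, and the sketch assumes the inputs are NUL-terminated strings and that count is at least 1.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size header: one per string, so the headers form a
   genuine C array that can be indexed normally. */
typedef struct {
    size_t len;     /* number of data bytes, including the terminating '\0' */
    size_t offset;  /* byte offset of the data from the start of the block */
} Entry;

/* Pack `count` C strings into one malloc'd block: headers first, data after.
   Returns NULL on allocation failure; the caller frees the block.
   Size overflow is not checked in this sketch. */
static Entry *pack_strings(const char *const *strs, size_t count) {
    size_t total = count * sizeof(Entry);
    for (size_t i = 0; i < count; i++)
        total += strlen(strs[i]) + 1;

    Entry *entries = malloc(total);
    if (entries == NULL)
        return NULL;

    size_t offset = count * sizeof(Entry);  /* data starts after the headers */
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(strs[i]) + 1;
        entries[i].len = len;
        entries[i].offset = offset;
        memcpy((char *)entries + offset, strs[i], len);
        offset += len;
    }
    return entries;
}

/* Data of entry i, located through its stored offset. */
static const char *entry_data(const Entry *entries, size_t i) {
    return (const char *)entries + entries[i].offset;
}
```

Storing offsets rather than pointers keeps the block relocatable: it can be copied or written to disk as a unit without fixing up addresses.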

As others have noted, you cannot have a C array of Foo. However, suppose you are willing to store them irregularly and just need to know how much space could be required. This answer shows how to work that out.

Let N be the number of Foo objects.

Let S be sizeof(Foo), which is the size of a Foo object with zero bytes for data.

Let A be _Alignof(Foo).

Every Foo object must start at an address aligned to A bytes. The worst case for padding is that the data array is one byte, requiring that A−1 bytes be skipped before the start of the next Foo.

Therefore, in addition to the 1024 bytes consumed by the Foo objects (including their data), we might need (N−1)•(A−1) bytes for this padding. (The N−1 is because no padding bytes are needed after the last Foo.)

If each Foo has at least one byte of data, the most N could be is floor(1024/(S+1)), because we know that all of the Foo objects and their data use at most 1024 bytes.

Therefore 1024 + floor(1024/(S+1)−1)*(A−1) bytes suffice: 1024 bytes for the actual data and floor(1024/(S+1)−1)*(A−1) for the padding.

Note that the above assumes each Foo has at least one byte of data. If one or more Foo have zero bytes of data, N could be larger than floor(1024/(S+1)). However, no padding is needed after any such Foo, and N cannot increase by more than one for each such Foo (because reducing the space used by one byte cannot make room for more than one additional Foo). Thus, such a Foo could give us one more Foo elsewhere that needs A−1 bytes of padding, but it does not itself need padding, so the total amount of padding needed cannot increase.
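The bound derived above can be computed directly. This is a small sketch under the answer's assumptions; the function name foo_pool_bound and its payload parameter (the 1024 in the question) are hypothetical. For instance, on a platform where S = 8 and A = 8, the formula gives 1024 + 112*7 = 1808 bytes.

```c
#include <stddef.h>

typedef struct {
    size_t len;
    char data[];
} Foo;

/* Worst-case allocation size for Foo objects whose headers and data total
   `payload` bytes: payload + floor(payload/(S+1) - 1) * (A - 1). */
static size_t foo_pool_bound(size_t payload) {
    size_t S = sizeof(Foo);
    size_t A = _Alignof(Foo);
    size_t max_n = payload / (S + 1);  /* integer division acts as floor */
    if (max_n == 0)
        return payload;                /* too small for any padded Foo */
    return payload + (max_n - 1) * (A - 1);
}
```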

So, a plan to assign memory for the Foo objects is:

  • Allocate 1024 + floor(1024/(S+1)−1)*(A−1) bytes.
  • Put the first Foo at the start of the allocated memory.
  • Put each successive Foo at the next A-aligned address after the end of the previous Foo (including its data).

This will not yield an array, of course, just a mass of Foo objects within the allocated space. You will need pointers or other means of addressing them.
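The placement plan might be sketched as follows. The helpers align_up, foo_append, and foo_next are hypothetical names, and the sketch assumes the pool itself starts at an address suitably aligned for Foo (as memory returned by malloc is) and that src is a valid pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t len;
    char data[];
} Foo;

/* Round n up to the next multiple of _Alignof(Foo). */
static size_t align_up(size_t n) {
    size_t a = _Alignof(Foo);
    return (n + a - 1) / a * a;
}

/* Hypothetical helper: store a Foo holding `len` bytes copied from `src` at
   byte offset `*offset` of `pool`, then advance `*offset` to the next
   aligned position. Returns NULL if the Foo does not fit in `pool_size`. */
static Foo *foo_append(void *pool, size_t pool_size, size_t *offset,
                       const void *src, size_t len) {
    assert(*offset % _Alignof(Foo) == 0);
    if (*offset > pool_size || pool_size - *offset < sizeof(Foo) ||
        len > pool_size - *offset - sizeof(Foo))
        return NULL;

    Foo *foo = (Foo *)((char *)pool + *offset);
    foo->len = len;
    memcpy(foo->data, src, len);
    *offset = align_up(*offset + sizeof(Foo) + len);
    return foo;
}

/* Step from one Foo to the one stored immediately after it. The caller must
   know how many Foo objects exist; there is no end marker. */
static Foo *foo_next(Foo *foo) {
    return (Foo *)((char *)foo + align_up(sizeof(Foo) + foo->len));
}
```

Because each Foo records its own len, the objects can be walked in order with foo_next without keeping a separate table of pointers, at the cost of losing constant-time access to the n-th object.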

Per C 2018 7.22.3.4 2:

The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.

So, chopping up the space returned by malloc in an irregular way to use for multiple objects is not a good fit for that specification. I will leave that for others to discuss, but I have not observed a C implementation that has a problem with it.

