C: create array of struct using constructor function

I have a C struct:

typedef struct {
  Dataset *datasets;
  int nDatasets;
  char *group_name;      
  enum groupType type;  
} DatasetGroup; 

It has a constructor function like this:

DatasetGroup * new_DatasetGroup(char *group_name, enum groupType type, enum returnCode *ret)
{
    DatasetGroup *dg;
    dg = (DatasetGroup *) malloc(sizeof(DatasetGroup));
    if (dg == NULL)
    {
     *ret = EMEMORY_ERROR;
    }

    // Allocate space for a few datasets
    dg->datasets = malloc(sizeof(Dataset) * INCREMENT);
    if (dg->datasets == NULL)
    {
        *ret = EMEMORY_ERROR;
    }
    dg->group_name= malloc(sizeof(char) * strlen(group_name));
    strcpy(dg->group_name, group_name);
    dg->type = type;
    groupCount++;
    return dg;
  }

I want to dynamically create an array of these structs. What's the best way to do this?

So far I have something like:

   DatasetGroup * make_array(){

     DatasetGroup *dg_array;
     // Allocate space for a few groups
     dg_array = (DatasetGroup *) malloc(sizeof(DatasetGroup) * INCREMENT);

     return dg_array;

   }

   void add_group_to_array(DatasetGroup *dg_array, ...){
      // Add a datasetgroup
      DatasetGroup *dg = new_DatasetGroup(...);

      // groupCount - 1 as the count is incremented when the group is created, so will always be one ahead of the array index we want to assign to
      dg_array[groupCount - 1] = dg;

     if (groupCount % INCREMENT == 0)
     {
      //Grow the array
      dg_array = realloc(dg_array, sizeof(DatasetGroup) * (groupCount + INCREMENT));
     }
 }

But this doesn't seem right. Any ideas?

A few suggestions:

  1. You have groupCount incremented by the struct's constructor function. This means you can only ever have one array of the struct that uses your array functions. I would recommend making the array responsible for managing the count.
  2. To that effect, if you want a managed array I would create a struct for it that keeps the pointer to the array, the number of objects, and the size of the array (i.e. the maximum number of structs it can currently hold).
  3. If you keep proper track of how many elements you have and the size of the array, you can replace groupCount % INCREMENT == 0 with something like groupCount == arraySize, which is a lot more intuitive in my opinion.
  4. You can avoid the second malloc in the constructor altogether by having the array be an array of the elements instead of an array of pointers. The constructor can then just initialize the struct members instead of allocating memory. If you are doing this a lot you will be avoiding a lot of memory fragmentation.
  5. Finally, while this depends on your application, I usually recommend that when you realloc you grow not by a constant but by a multiple of the current array size. If, say, you double the array size, you only have to do about log_2 n reallocs, with n being the final array size, and you waste at most half of the memory (memory is generally cheap; like I said, it depends on the application). If that wastes too much memory you can use a factor of, say, 1.5. For a more detailed explanation I recommend this Joel on Software article; the part about realloc is about 2/3 of the way down.
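To make the difference in suggestion 5 concrete, here is a small sketch that only counts growth steps; it allocates nothing. The helper names grows_constant and grows_doubling are made up for illustration, and both assume a non-zero starting capacity and increment.

```c
#include <stddef.h>

/* Hypothetical helper: number of growth steps needed before a buffer that
 * starts with `initial` slots can hold `n` elements, when every step adds
 * a fixed `increment`. Assumes increment > 0. */
size_t grows_constant(size_t initial, size_t increment, size_t n)
{
    size_t capacity = initial;
    size_t grows = 0;
    while (capacity < n) {
        capacity += increment;
        grows++;
    }
    return grows;
}

/* Same count when every step doubles the capacity. Assumes initial > 0. */
size_t grows_doubling(size_t initial, size_t n)
{
    size_t capacity = initial;
    size_t grows = 0;
    while (capacity < n) {
        capacity *= 2;
        grows++;
    }
    return grows;
}
```

Starting from 20 slots and growing to hold 1,000,000 elements, grows_constant(20, 20, 1000000) returns 49999 while grows_doubling(20, 1000000) returns 16. Each growth step stands for one realloc call, which in the worst case copies the whole array.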

Update:

A few other things:

dg = (DatasetGroup *) malloc(sizeof(DatasetGroup));
if (dg == NULL)
{
 ret = EMEMORY_ERROR;
}

// Allocate space for a few datasets
dg->datasets = malloc(sizeof(Dataset) * INCREMENT);

As previously pointed out, this is very bad, because you will use dg even if it is NULL. You probably want to exit the function right after detecting the error.

Furthermore, you are setting ret, but ret is passed by value, so the change is not visible to the caller. Instead you probably want to pass a pointer and dereference it.
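A minimal sketch of the constructor with both points applied: it returns as soon as an allocation fails, and it reports the status through a pointer. The Dataset body, the enum values, ESUCCESS and the value of INCREMENT are stand-ins assumed here so the block compiles on its own. The strlen(group_name) + 1 is my own addition, since strcpy also writes the terminating '\0'. groupCount is left out, in line with suggestion 1.

```c
#include <stdlib.h>
#include <string.h>

#define INCREMENT 10                              /* assumed value */

typedef struct { int placeholder; } Dataset;      /* stand-in type */
enum groupType { GROUP_A, GROUP_B };              /* stand-in values */
enum returnCode { ESUCCESS, EMEMORY_ERROR };      /* ESUCCESS is assumed */

typedef struct {
    Dataset *datasets;
    int nDatasets;
    char *group_name;
    enum groupType type;
} DatasetGroup;

DatasetGroup *new_DatasetGroup(const char *group_name, enum groupType type,
                               enum returnCode *ret)
{
    DatasetGroup *dg = malloc(sizeof *dg);
    if (dg == NULL) {
        *ret = EMEMORY_ERROR;
        return NULL;                  /* stop before dg is dereferenced */
    }

    dg->datasets = malloc(sizeof(Dataset) * INCREMENT);
    dg->group_name = malloc(strlen(group_name) + 1);  /* +1 for '\0' */
    if (dg->datasets == NULL || dg->group_name == NULL) {
        free(dg->datasets);           /* free(NULL) is a no-op */
        free(dg->group_name);
        free(dg);
        *ret = EMEMORY_ERROR;
        return NULL;
    }

    strcpy(dg->group_name, group_name);
    dg->nDatasets = 0;
    dg->type = type;
    *ret = ESUCCESS;
    return dg;
}
```

A caller would declare an enum returnCode variable, pass its address as the last argument, and check either the returned pointer or the status before using the group.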

Update 2: Can I give an example? Sure. A quick one? Not so much ;-D.

Consider the following code (I apologize if there are any mistakes, I'm still half asleep):

#include <stdio.h>
#include <stdlib.h>

#define LESS_MALLOCS

#define MAX_COUNT 100000000

typedef struct _foo_t
{
  int bar1;
  int bar2;
} foo_t;

void foo_init(foo_t *foo, int bar1, int bar2)
{
  foo->bar1 = bar1;
  foo->bar2 = bar2;
}

foo_t* new_foo(int bar1, int bar2)
{
  foo_t *foo = malloc(sizeof(foo_t));
  if(foo == NULL) {
    return NULL;
  }
  foo->bar1 = bar1;
  foo->bar2 = bar2;
  return foo;
}

typedef struct _foo_array_t
{
#ifdef LESS_MALLOCS
  foo_t *array;
#else
  foo_t **array;
#endif
  int count;
  int length;
} foo_array_t;

void foo_array_init(foo_array_t* foo_array, int size) {
  foo_array->count = 0;
#ifdef LESS_MALLOCS
  foo_array->array = malloc(sizeof(foo_t) * size);
#else
  foo_array->array = malloc(sizeof(foo_t*) * size);
#endif
  foo_array->length = size;
}

int foo_array_add(foo_array_t* foo_array, int bar1, int bar2)
{
  if(foo_array->count == foo_array->length) {
#ifdef LESS_MALLOCS
    size_t new_size = sizeof(foo_t) * foo_array->length * 2;
#else
    size_t new_size = sizeof(foo_t*) * foo_array->length * 2;
#endif
    void* tmp = realloc(foo_array->array, new_size);

    if(tmp == NULL) {
      return -1;
    }
    foo_array->array = tmp;
    foo_array->length *= 2;
  }
#ifdef LESS_MALLOCS
  foo_init(&(foo_array->array[foo_array->count++]), bar1, bar2);
#else
  foo_array->array[foo_array->count] = new_foo(bar1, bar2);
  if(foo_array->array[foo_array->count] == NULL) {
    return -1;
  }
  foo_array->count++;
#endif
  return foo_array->count;
}


int main()
{
  int i;
  foo_array_t foo_array;
  foo_array_init(&foo_array, 20);
  for(i = 0; i < MAX_COUNT; i++) {
    if(foo_array_add(&foo_array, i, i+1) != (i+1)) {
      fprintf(stderr, "Failed to add element %d\n", i);
      return EXIT_FAILURE;
    }
  }

  printf("Added all elements\n");
  return EXIT_SUCCESS;
}

There is a struct (foo_t) with two members (bar1 and bar2) and another struct that is an array wrapper (foo_array_t). foo_array_t keeps track of the current size of the array and the number of elements in the array. It has an add-element function (foo_array_add). Note that there is both a foo_init and a new_foo: foo_init takes a pointer to a foo_t, while new_foo does not and instead returns a pointer. So foo_init assumes the memory has already been allocated in some way (heap, stack, it doesn't matter), while new_foo allocates memory from the heap. There is also a preprocessor macro called LESS_MALLOCS. It changes the definition of the array member of foo_array_t, the size of the initial array allocation, the size during reallocation, and whether foo_init or new_foo is used. The array and its size have to change to reflect whether a pointer or the actual element is stored in the array. With LESS_MALLOCS defined, the code follows my suggestion number 4; without it, the code is more similar to yours. Finally, main contains a simple micro-benchmark. The results are the following:

[missimer@asus-laptop tmp]$ gcc temp.c # Compile with LESS_MALLOCS defined
[missimer@asus-laptop tmp]$ time ./a.out 
Added all elements

real    0m1.747s
user    0m1.384s
sys     0m0.357s
[missimer@asus-laptop tmp]$ gcc temp.c # Compile with LESS_MALLOCS not defined
[missimer@asus-laptop tmp]$ time ./a.out 
Added all elements

real    0m9.360s
user    0m4.804s
sys     0m1.968s

Not that time is the best way to measure a benchmark, but in this case I think the results speak for themselves. Also, when you allocate an array of elements, instead of an array of pointers whose elements are then allocated separately, you reduce the number of places where you have to check for errors. Of course everything has trade-offs: if, for example, the struct were very large and you wanted to move elements around in the array, you would be doing a lot of memcpy-ing, as opposed to just moving a pointer around as in your approach.

Also, I would recommend against this:

dg_array = realloc(dg_array, sizeof(DatasetGroup) * (groupCount + INCREMENT));

You lose the value of the original pointer if realloc fails and returns NULL. Also, as with your earlier ret, you should pass a pointer to dg_array instead of its value: as written, the reassignment changes only the callee's copy, and the callee then returns, so it has no real effect for the caller. Finally, I noticed you changed your function definition to take a pointer to ret, but you need to dereference that pointer when you use it; you should be getting compiler warnings (perhaps even errors) with what you currently have.
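One way to address both realloc issues, sketched under assumptions: grow_groups is a hypothetical helper, and the one-member DatasetGroup is a stand-in so the block compiles by itself. The result of realloc goes into a temporary first, and the array is passed as a pointer to a pointer so the caller sees the new address.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { int id; } DatasetGroup;   /* stand-in for the real struct */

/* Hypothetical helper: resizes *dg_array to hold new_count elements.
 * Returns 0 on success and -1 on failure. On failure *dg_array is left
 * untouched, so the caller still owns a valid block. */
int grow_groups(DatasetGroup **dg_array, size_t new_count)
{
    /* Reject a zero size and a byte count that would overflow size_t. */
    if (new_count == 0 || new_count > SIZE_MAX / sizeof(DatasetGroup)) {
        return -1;
    }

    DatasetGroup *tmp = realloc(*dg_array, sizeof(DatasetGroup) * new_count);
    if (tmp == NULL) {
        return -1;                 /* old block is still valid */
    }

    *dg_array = tmp;               /* visible to the caller */
    return 0;
}
```

The caller passes &dg_array and checks the return value. Because realloc(NULL, size) behaves like malloc(size), the same helper can also perform the first allocation when the pointer starts out as NULL.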

You could do one of two things: either dynamically create an array of struct pointers and then call your new function to create N data groups, or dynamically request memory for N structures at once, which means your N structures are allocated contiguously.

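// allocate an array of N pointers, then construct each group separately
DatasetGroup **parry = malloc(sizeof(DatasetGroup *) * N);
for (int i = 0; i < N; i++) {
    parry[i] = new_DatasetGroup(...);  // your constructor
}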

Or

// allocate N empty (zeroed) structures in one contiguous block
DatasetGroup *contarr = calloc(N, sizeof(DatasetGroup));

The second method might need a different initialization routine than your constructor, as the memory for each struct is already allocated.
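A sketch of what such an initialization routine could look like for the contiguous approach. DatasetGroup_init and make_groups are hypothetical names, and the Dataset body and enum values are stand-ins. Deferring the datasets allocation until a dataset is actually added is my own assumption, not something the question requires.

```c
#include <stdlib.h>
#include <string.h>

typedef struct { int placeholder; } Dataset;   /* stand-in type */
enum groupType { GROUP_A, GROUP_B };           /* stand-in values */

typedef struct {
    Dataset *datasets;
    int nDatasets;
    char *group_name;
    enum groupType type;
} DatasetGroup;

/* Hypothetical init routine: fills in a DatasetGroup whose storage already
 * exists. Returns 0 on success, -1 if the name copy cannot be allocated. */
int DatasetGroup_init(DatasetGroup *dg, const char *group_name,
                      enum groupType type)
{
    char *copy = malloc(strlen(group_name) + 1);   /* +1 for '\0' */
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, group_name);

    dg->datasets = NULL;       /* assumed: allocated later, on first add */
    dg->nDatasets = 0;
    dg->group_name = copy;
    dg->type = type;
    return 0;
}

/* Allocates n contiguous groups and initializes each one in place. */
DatasetGroup *make_groups(size_t n, const char *name, enum groupType type)
{
    DatasetGroup *arr = calloc(n, sizeof *arr);
    if (arr == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        if (DatasetGroup_init(&arr[i], name, type) != 0) {
            while (i-- > 0) {              /* undo the groups already set up */
                free(arr[i].group_name);
            }
            free(arr);
            return NULL;
        }
    }
    return arr;
}
```

With this layout there is one allocation for all N structs plus one per name, and cleanup is a loop that frees each group_name followed by a single free of the array.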

