Why is memory size doubled when allocating a large number of char*?

I allocate a 2D array of char * with 50 rows and 2000000 columns, and every string has a length of 12.

Let's calculate it: 50 * 2000000 * (12 (length) + 8 (for the pointer)). I am on a 64-bit system.

50 * 2000000 * 20 = 2000000000 bytes, which is 2 GB.

When I check the memory monitor, it shows that the process takes 4 GB. (All of that happened after the allocation.)

This is the code:

int col=2000000,row=50,i=0,j=0;
char *** arr;
arr=(char***)malloc(sizeof(char**)*row);
for(i=0;i<row;i++)
{
    arr[i]=(char ** )malloc(sizeof(char*)*col);
    for(j=0;j<col;j++)
    {
        arr[i][j]=(char*)malloc(12);
        strcpy(arr[i][j],"12345678901");
        arr[i][j][11]='\0';
    }
}

Could this be caused by paging in Linux?

Each call to malloc takes more memory than you ask for. malloc needs to store its internal bookkeeping somewhere, such as the size of the allocated block and some information about neighboring chunks. Also, (very probably) each returned pointer is aligned to 16 bytes. By my estimate, each 12-byte allocation takes 32 bytes of memory. Together with the 8-byte pointer that is 40 bytes per string, and 50 * 2000000 * 40 bytes is 4 GB, which matches what you see. If you want to save memory, allocate all the strings with one malloc call and split the block into 12-byte pieces yourself. Try the following:

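int col=2000000,row=50,i=0,j=0;
char *** arr;
arr= malloc(sizeof(*arr)*row);
for(i=0;i<row;i++)
{
  arr[i]= malloc(sizeof(*arr[i])*col);
  /* one block holding all 12-byte strings of this row */
  char *colmem = malloc(12 * col);
  for(j=0;j<col;j++)
  {
     /* point each entry at its own 12-byte slice of the block */
     arr[i][j] = colmem + j*12;
     strcpy(arr[i][j],"12345678901");
  }
}
/* to release: for each row, free(arr[i][0]) and then free(arr[i]); finally free(arr) */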
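
The 32-byte estimate in this answer can be turned into a small back-of-the-envelope calculation. The sketch below is only a model, not a measurement: it assumes an allocator that adds an 8-byte header to each chunk, rounds chunks up to a multiple of 16 bytes, and never hands out less than 32 bytes (roughly how 64-bit glibc tends to behave, but other allocators differ). The function names `model_chunk_size`, `per_string_malloc_bytes` and `pooled_bytes` are made up for illustration.

```c
#include <stddef.h>

/* ASSUMED allocator model, loosely based on 64-bit glibc and not guaranteed
   for any particular system: each chunk carries an 8-byte header, is rounded
   up to a multiple of 16 bytes, and is never smaller than 32 bytes. */
static size_t model_chunk_size(size_t request)
{
    size_t chunk = (request + 8 + 15) / 16 * 16;
    return chunk < 32 ? 32 : chunk;
}

/* Estimated footprint when every string gets its own malloc call:
   one chunk plus one 8-byte pointer per string (64-bit pointers assumed). */
static unsigned long long per_string_malloc_bytes(unsigned long long rows,
                                                  unsigned long long cols,
                                                  size_t str_len)
{
    return rows * cols * (model_chunk_size(str_len) + 8ULL);
}

/* Estimated footprint when the strings of a row share one big block:
   str_len bytes plus one 8-byte pointer per string. */
static unsigned long long pooled_bytes(unsigned long long rows,
                                       unsigned long long cols,
                                       size_t str_len)
{
    return rows * cols * (str_len + 8ULL);
}
```

For rows = 50, cols = 2000000 and str_len = 12, the first estimate is 4,000,000,000 bytes and the second is 2,000,000,000 bytes, which lines up with the 4 GB observed and the 2 GB expected in the question. Both figures ignore the comparatively small row-pointer arrays.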

I would rewrite the code from scratch. For some reason, around 99% of all C programmers don't know how to correctly allocate true 2D arrays dynamically. I'm not even sure I'm one of the 1% who do, but let's give it a shot:

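#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
  const int COL_N = 2000000;
  const int ROW_N = 50;

  /* One malloc call: COL_N contiguous rows of ROW_N chars each.
     arr points to the first char[ROW_N] row. */
  char (*arr)[ROW_N] = malloc( sizeof(char[COL_N][ROW_N]) );

  if(arr == NULL)
  {
    fprintf(stderr, "Out of memory\n");
    return EXIT_FAILURE;
  }

  /* Store a string in each of the first ROW_N rows;
     strcpy also copies the terminating '\0'. */
  for(int row=0; row<ROW_N; row++)
  {
    strcpy(arr[row], "12345678901");
    puts(arr[row]);
  }

  /* A single free releases the whole array. */
  free(arr);

  return 0;
}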

The important parts here are:

  • You should always allocate multi-dimensional arrays in adjacent memory cells; otherwise they are not arrays but pointer-based lookup tables. Thus you only need a single malloc call.
  • This should save a bit of memory, since you only need one pointer and it is allocated on the stack. No pointers are allocated on the heap.
  • Casting the return value of malloc is pointless (but not dangerous on modern compilers).
  • Ensure that malloc actually worked, particularly when allocating ridiculous amounts of memory.
  • strcpy copies the null terminator, so you don't need to write it manually.
  • There is no need for nested loops. You want to allocate a 2D array, not a 3D one.
  • Always clean up your own mess with free(), even though the OS might do it for you.
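
The code above stores only one string per row. If every one of the 50 x 2000000 slots from the question should hold its own 12-byte string, the same single-malloc idea could be extended by treating the string length as a third, innermost dimension. The following is a minimal sketch under that assumption, not the answer's own method; `alloc_string_grid` and `STR_LEN` are hypothetical names, and the code relies on variably modified array types (C99, optional in C11).

```c
#include <stdlib.h>
#include <string.h>

enum { STR_LEN = 12 };  /* 11 characters plus the terminating '\0' */

/* Allocates rows x cols strings of STR_LEN bytes as ONE contiguous block,
   fills every slot with (a possibly truncated copy of) text, and returns
   the block, or NULL on failure. The caller releases it with one free(). */
static void *alloc_string_grid(size_t rows, size_t cols, const char *text)
{
    /* arr points to rows of cols strings; arr[r][c] is one char[STR_LEN]. */
    char (*arr)[cols][STR_LEN] = malloc(rows * sizeof *arr);
    if (arr == NULL)
        return NULL;

    /* Copy at most STR_LEN - 1 characters so the terminator always fits. */
    size_t n = strlen(text);
    if (n >= STR_LEN)
        n = STR_LEN - 1;

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            memcpy(arr[r][c], text, n);
            arr[r][c][n] = '\0';
        }
    }
    return arr;
}
```

A caller would assign the result to a pointer of matching type, for example `char (*grid)[cols][STR_LEN] = alloc_string_grid(rows, cols, "12345678901");`, and index it as `grid[r][c]`. With the question's dimensions this requests 50 * 2000000 * 12 = 1,200,000,000 bytes in a single call, so the NULL check matters.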
