Why is the heap after array allocation so large?

I've got a very basic application that boils down to the following code:

char* gBigArray[200][200][200];
unsigned int Initialise(){  
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=new char;
    return sizeof(gBigArray);
}

The function returns the expected value of 32000000 bytes, which is approximately 30MB, yet the Windows Task Manager (and granted, it's not 100% accurate) gives a Memory (Private Working Set) value of around 157MB. I've loaded the application into VMMap by SysInternals and have the following values:

I'm unsure what Image means (listed under Type), although regardless of that, its value is around what I'm expecting. What is really throwing things out for me is the Heap value, which is where the apparent enormous size is coming from.

What I don't understand is why this is. According to this answer, if I've understood it correctly, gBigArray would be placed in the data or bss segment; since each element is an uninitialised pointer, I'm guessing it would be placed in the bss segment. Why then would the heap value be larger by a silly amount than what is required?

It doesn't sound silly if you know how memory allocators work. They keep track of the allocated blocks, so there's a field storing the size and also a pointer to the next block, perhaps even some padding. Some compilers place guard space around the allocated area in debug builds, so if you write beyond or before the allocated area the program can detect it at runtime when you try to free the allocated space.
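One rough way to see this bookkeeping is to make many one-byte allocations and look at how far apart the returned addresses are. The sketch below is only an illustration: `MinBlockSpacing` is a hypothetical helper (not from the question), and the number it returns depends entirely on the allocator, build mode, and platform.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: allocates `count` single chars with `new char` and
// returns the smallest gap, in bytes, between two neighbouring blocks.
// Returns 0 when fewer than two blocks were allocated.
std::size_t MinBlockSpacing(std::size_t count) {
    std::vector<char*> blocks;
    blocks.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        blocks.push_back(new char);

    // Sort the addresses so that neighbours in memory are adjacent in the list.
    std::vector<std::uintptr_t> addresses;
    addresses.reserve(count);
    for (char* p : blocks)
        addresses.push_back(reinterpret_cast<std::uintptr_t>(p));
    std::sort(addresses.begin(), addresses.end());

    std::size_t smallest = 0;
    for (std::size_t i = 1; i < addresses.size(); ++i) {
        std::size_t gap = static_cast<std::size_t>(addresses[i] - addresses[i - 1]);
        if (smallest == 0 || gap < smallest)
            smallest = gap;
    }

    for (char* p : blocks)
        delete p;
    return smallest;
}
```

Calling something like `MinBlockSpacing(100000)` and printing the result gives a lower bound on what each `new char` really costs. If it reports a value well above 1, that difference is the header and padding described above.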

You are allocating one char at a time. There is typically a space overhead per allocation.

Allocate the memory in one big chunk (or at least in a few chunks).

Do not forget that char* gBigArray[200][200][200]; allocates space for 200*200*200=8000000 pointers, each one word in size. That is 32 MB on a 32 bit system.

Add another 8000000 chars to that for another 8MB. Since you are allocating them one by one, the allocator probably can't place them at one byte per item, so they'll probably also take at least the word size per item, resulting in another 32MB (32 bit system).

The rest is probably overhead, which is also significant because the C++ system must remember how many elements an array allocated with new contains for delete [].
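To put rough numbers on this, the figures from the question can be worked through directly. This is a back-of-the-envelope sketch and rests on several assumptions: a 32 bit build (4 byte pointers), 1 MB taken as 10^6 bytes, and the whole 157MB working set being attributed to the pointer array plus the heap, which overstates the heap somewhat. All names are hypothetical.

```cpp
#include <cstdint>

// Number of elements in gBigArray.
constexpr std::uint64_t kElements = 200ULL * 200 * 200;
static_assert(kElements == 8000000, "200^3 elements");

// Assumption: 32 bit build, so each char* takes 4 bytes.
constexpr std::uint64_t kPointerBytes = kElements * 4;
static_assert(kPointerBytes == 32000000, "matches sizeof(gBigArray) in the question");

// Assumption: the 157MB figure from Task Manager, with 1 MB = 10^6 bytes.
constexpr std::uint64_t kObservedBytes = 157ULL * 1000 * 1000;

// Whatever is not the pointer array is attributed to the heap here.
constexpr std::uint64_t kHeapBytes = kObservedBytes - kPointerBytes;
static_assert(kHeapBytes == 125000000, "125 MB left for 8000000 allocations");

// Approximate heap bytes consumed by each `new char`.
double EstimatedBytesPerAllocation() {
    return static_cast<double>(kHeapBytes) / static_cast<double>(kElements);
}
```

Under those assumptions each one-byte allocation costs a little under 16 bytes of heap, which is in line with an allocator that adds a small header and rounds every block up to a minimum size.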

Owww. My embedded systems stuff would roll over and die if faced with that code. Each allocation has quite a bit of extra info associated with it, and is either spaced to a fixed size or managed via a linked list type object. On my system, that 1 char new would become a 64 byte allocation out of a small object allocator, such that management would be in O(1) time. But in other systems, this could easily fragment your memory horribly and make subsequent news and deletes run extremely slowly (O(n), where n is the number of things the allocator tracks). In general it brings doom upon an app over time, as each char would become at least a 32 byte allocation and be placed in all sorts of cubby holes in memory, thus pushing your allocation heap out much further than you might expect.

Do a single large allocation and map your 3D array over it, with a placement new or other pointer trickery if you need to.
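A minimal sketch of that idea, assuming plain index arithmetic is enough and no placement new is needed: one contiguous buffer holds every char, and a small wrapper turns the three indices into an offset. `CharGrid3D` and its members are hypothetical names, not something from the question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper: one heap allocation for the whole 3D grid.
class CharGrid3D {
public:
    CharGrid3D(std::size_t nx, std::size_t ny, std::size_t nz)
        : ny_(ny), nz_(nz), data_(nx * ny * nz) {}  // elements start as '\0'

    // Row-major mapping: the last index varies fastest in memory.
    char& at(std::size_t a, std::size_t b, std::size_t c) {
        return data_[(a * ny_ + b) * nz_ + c];
    }

    // Total number of chars stored.
    std::size_t size() const { return data_.size(); }

private:
    std::size_t ny_;
    std::size_t nz_;
    std::vector<char> data_;
};
```

With `CharGrid3D grid(200, 200, 200);` the heap holds a single block of 8000000 chars plus one allocation's worth of overhead, and `grid.at(ta, tb, tc)` replaces `*gBigArray[ta][tb][tc]`. The 32 MB pointer table disappears as well, since offsets are computed rather than stored.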

Edited out of the above post into a community wiki post:

As the answers below say, the issue here is that I am creating a new char 200^3 times, and although each char is only 1 byte, there is overhead for every object on the heap. It seems creating a single char array for all the chars knocks the memory down to a more believable level:

char* gBigArray[200][200][200];
char* gCharBlock=new char[200*200*200];
unsigned int Initialise(){  
    unsigned int mIndex=0;
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=&gCharBlock[mIndex++];
    return sizeof(gBigArray);
}

Allocating 1 char at a time is probably more expensive. There are metadata headers per allocation, and the 1 byte for a character is smaller than that header. You might actually save space by doing one large allocation (if possible); that way you avoid the overhead of each individual allocation having its own metadata.

Perhaps this is an issue of memory stride? What size of gaps are there between values?

30 MB is for the pointers. The rest is for the storage you allocated with the new call that the pointers are pointing to. Compilers are allowed to allocate more than one byte for various reasons, such as aligning on word boundaries or leaving some growing room in case you want it later. If you want 8 MB worth of characters, leave the * off your declaration for gBigArray.
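As a quick illustration of dropping the `*`, the declaration below stores the chars directly in the array, so nothing is allocated on the heap at all. The name `gCharArray` is made up here to avoid clashing with the question's `gBigArray`; the size check holds because sizeof(char) is always 1.

```cpp
#include <cstddef>

// Array of chars rather than pointers to char: the storage lives in the
// array itself (static storage for a global), not on the heap.
char gCharArray[200][200][200];

static_assert(sizeof(gCharArray) == 8000000, "200*200*200 chars, one byte each");

// Mirrors Initialise() from the question, minus the per-element new.
std::size_t CharArrayBytes() {
    return sizeof(gCharArray);
}
```

Elements are then read and written as `gCharArray[ta][tb][tc]` with no dereference, and there is no per-element allocation to pay overhead on.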
