
Why does running std::thread with an empty function use so much memory?

I wrote a simple program that should run two threads, sort a small array (~4096 bytes), and write the result to an output file. The input data is in one big file (~4 GB). The computer has 128 MB of memory. I found that running just an empty main function uses 14 MB of memory. If I run std::thread with an empty function, the application starts to use ~8 MB per thread. But if I make just one dynamic memory allocation, the program starts to use approximately 64 MB per thread. I don't understand what could be using so much memory. How can I control this size? And how should I allocate dynamic memory to minimize these default system allocations?

  • System: Ubuntu 14.04.3
  • Compiler: gcc 4.8.4
  • Compiler options: '-std=c++11 -O3 -pthread'

  • This is a code example:

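     #include <thread>
     #include <vector>

     void dummy(void)
     {
         // A single small heap allocation per thread.
         std::vector<unsigned int> g(1);

         // Busy loop to keep the thread alive for a while.
         int i = 0;
         while (i < 500000000) {
             ++i;
         }
     }

     int main(void)
     {
         std::thread t1(&dummy);
         std::thread t2(&dummy);
         std::thread t3(&dummy);
         t1.join();
         t2.join();
         t3.join();
         return 0;
     }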

Every thread has its own stack. On Linux, the default stack size is typically 8 MB. When you allocate memory for the first time, the heap memory allocator might actually reserve a big chunk up front. This might explain the 64 MB per thread you are seeing.

That said, when I say "allocated", that doesn't mean that this memory is really used. The allocation happens in the virtual memory space of the process. This is what you see under the column VSZ when you run ps, or under the column VIRT when you run top. But Linux knows that you are probably not going to use most of that allocated memory anyway. So while you have allocated a chunk of virtual memory, Linux does not allocate any physical memory to back it until the process actually starts writing to that memory. The real amount of physical memory used by a process is shown under RSS for ps and RES for top. Linux allows more virtual memory to be allocated than there is physical memory in total.

Even though you might not run out of physical memory, if you have a lot of threads on a 32-bit system, each of which is allocating 8 MB of virtual memory, you might run out of the virtual memory space of your process (which is on the order of 2 GB). While C++'s thread library does not allow you to change the size of the stack, the C pthreads library allows you to do this by supplying pthread_create() with a pthread_attr_t which you adjusted using pthread_attr_setstacksize(). See also this Stack Overflow question.

The value that you reported for ulimit -s in the comments above does suggest that the thread is still allocating a stack even if it is an empty main. The function call that is executed in the thread would require a stack to pass a return address, assuming you're on x86.

@Karrek SB is heading in the right direction with this. The allocator that you are using can affect the heap size for your program. In order to avoid repeated calls to brk or sbrk, allocators will usually request larger initial blocks of memory. It isn't unreasonable to expect values on the order of megabytes when the allocator is first initialized, especially values that align nicely along typical page boundaries such as 4, 8, 32, 64, etc.

How much control you have over the amount of memory allocated will vary. See if your allocator supports the mallopt function. With a bit of trial and error, you may be able to reduce your overall memory footprint. Otherwise, you could always implement your own allocator.


 