
How can I get a guarantee that when memory is freed, the OS will reclaim that memory for its use?

I noticed that this program:

#include <stdio.h>
#include <stdlib.h>  // malloc, free
#include <string.h>  // memset
#include <unistd.h>  // usleep

int main() {
  const size_t alloc_size = 1*1024*1024;
  for (size_t i = 0; i < 3; i++) {
    printf("1\n");
    usleep(1000*1000);
    void *p[3];
    for (size_t j = 3; j--; )
      memset(p[j] = malloc(alloc_size),0,alloc_size); // memset touches every page, so the memory is really backed by RAM
    usleep(1000*1000);
    printf("2\n");
    free(p[i]);
    p[i] = NULL;
    usleep(1000*1000*4);
    printf("3\n");
    for (size_t j = 3; j--; )
      free(p[j]);
  }
}

which allocates three blocks of memory, three times over, and each time frees a different block first, releases the memory according to watch free -m. This means that the OS reclaimed the memory on every free, regardless of the block's position inside the program's address space. Can I somehow get a guarantee of this effect? Or is there already something like that (such as a rule for allocations larger than 64 KB)?

The short answer is: in general, you cannot guarantee that the OS will reclaim the freed memory, but there may be an OS-specific way to do it, or a better way to ensure such behavior.

The long answer:

  • Your code has undefined behavior: there is an extra free(p[i]); after the printf("2\n"); which accesses beyond the end of the p array.

  • You allocate large blocks (1 MB), for which your library makes individual system calls (for example mmap on Linux systems), and free releases these blocks to the OS, hence the observed behavior.

  • Various OSes are likely to implement such behavior above a system-specific threshold (typically 128 KB), but the C standard gives no guarantee about this, so relying on such behavior is system specific.

  • Read the manual page for malloc() on your system to see if this behavior can be controlled. For example, the GNU C library on Linux reads the environment variable MALLOC_MMAP_THRESHOLD_ (the counterpart of the mallopt() parameter M_MMAP_THRESHOLD) to override the default setting for this threshold.

  • If you program to a POSIX target, you might want to use mmap() directly instead of malloc() to guarantee that the memory is returned to the system once deallocated with munmap(). Note that an anonymous block returned by mmap() reads as all bits zero on first access, so you can either skip explicit initialization to take advantage of on-demand paging, or perform explicit initialization to make sure the memory is actually mapped and so minimize latency in later operations.

On the OSes I know, and especially on Linux:

no, you cannot guarantee reuse. Why would you want that? Reuse only happens when someone needs more memory pages, and Linux will then have to pick pages that aren't currently mapped to a process; if these run out, you'll get into swapping. And: you can't make your OS do something that is none of your process's business. How it internally manages memory allocations is none of the freeing process's business. In fact, that's a good thing security-wise.

What you can do is not only free the memory (which might leave it allocated to your process, handled by your libc, for later mallocs), but actually give it back (man sbrk, man munmap, have fun). That's not something you'd usually do.

Also: this is yet another instance of "help, Linux ate my RAM"... you misinterpret what free tells you.

For glibc malloc(), read the man 3 malloc man page.

In short, smaller allocations use memory provided by sbrk() to extend the data segment; this is generally not returned to the OS. Larger allocations (128 KiB or more by default; on glibc you can change the limit with mallopt(M_MMAP_THRESHOLD, ...) or the MALLOC_MMAP_THRESHOLD_ environment variable) use mmap() to allocate anonymous memory pages (which also hold the allocator's bookkeeping for those allocations), and when freed, these are usually returned to the OS immediately.

The only case in which you should worry about the process returning memory to the OS in a timely manner is a long-running process that temporarily makes a very large allocation while running on an embedded or otherwise memory-constrained device. Why? Because this stuff has been done in C successfully for decades, and the C library and the OS kernel handle these cases just fine. It just isn't a practical problem in normal circumstances. You only need to worry about it if you know it is a practical problem, and it won't be one except in very specific circumstances.


I personally do routinely use mmap(2) in Linux to map pages for huge data sets. Here, "huge" means "too large to fit in RAM and swap".

The most common case is when I have a truly huge binary data set. Then, I create a (sparse) backing file of suitable size, and memory-map that file. Years ago, in another forum, I showed an example of how to do this with a terabyte data set (yes, 1,099,511,627,776 bytes), of which only 250 megabytes or so was actually manipulated in that example, to keep the data file small. The key in this approach is to use MAP_SHARED | MAP_NORESERVE to ensure the kernel does not use swap for this data set (because swap would be insufficient, and the mapping would fail), but uses the file backing directly. We can use madvise() to inform the kernel of our probable access patterns as an optimization, but in most cases it does not have that big of an effect (as the kernel heuristics do a pretty good job of it anyway). We can also use msync() to ensure certain parts are written to storage. (That has certain effects with respect to other processes that read the file backing the mapping, especially depending on whether they read it normally or use options like O_DIRECT; and, if the file is shared over NFS or similar, with respect to processes reading the file remotely. It all gets quite complicated very quickly.)

If you do decide to use mmap() to acquire anonymous memory pages, do note that you need to keep track of both the pointer and the length (the length being a multiple of the page size, sysconf(_SC_PAGESIZE)), so that you can release the mapping later using munmap(). Obviously, this is then completely separate from normal memory allocation (malloc(), calloc(), free()); but unless you try to use specific addresses, the two will not interfere with each other.

If you want memory to be reclaimed by the operating system, you need to use operating system services to allocate the memory (which will be allocated in pages). To deallocate the memory, you need to call the operating system services that remove pages from your process.

Unless you write your own malloc/free that does this, you are never going to be able to accomplish your goal with off-the-shelf library functions.
