
Window control for an mmapped large file (Linux, mmap)

Original post: 2022-12-17 17:55:38 4 1 c/ linux/ linux-kernel/ mmap/ large-files

How can we control the window in RSS when mapping a large file? Let me explain what I mean. Suppose we have a large file that exceeds RAM several times over, and several processes map it with a shared mmap. When we access an object whose virtual address lies in the mapped region, we take a page fault and the page is read from disk. The sub-question is: does the opposite happen once we no longer use that object? If eviction works like an LRU, what is the size of the LRU and how can it be controlled? How is the page cache involved in this case?

[Image: RSS graph]

This is the RSS graph on a test instance (2 threads, 8 GB RAM) for an 80 GB tar file. Where does the value of 3800 MB come from, and why does it stay stable while I run through the file after it has been mapped? How can I control it (or advise the kernel to control it)?

1 Answer

As long as you're not taking explicit action to lock the pages in memory, they should eventually be evicted automatically. (For a file-backed mapping like this one the pages live in the page cache, so clean pages are simply dropped and dirty ones are written back to the file, rather than going to swap.) The kernel basically uses a memory-pressure heuristic to decide how much physical memory to devote to resident pages, and rebalances frequently as needed.
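If you want to watch that rebalancing happen from inside the process, one option is to sample the RSS yourself while you walk the file. This is only a rough sketch: the helper name `current_rss_bytes` is made up for illustration, and it assumes a Linux system where `/proc/self/statm` is readable (its second field is the number of resident pages).

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns the calling process's resident set size
 * in bytes, or -1 if /proc/self/statm cannot be read or parsed. */
long long current_rss_bytes(void)
{
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;

    /* statm starts with: total program size, then resident set size,
     * both counted in pages. */
    long long total_pages = 0, resident_pages = 0;
    int n = fscanf(f, "%lld %lld", &total_pages, &resident_pages);
    fclose(f);
    if (n != 2)
        return -1;

    return resident_pages * (long long)sysconf(_SC_PAGESIZE);
}
```

Calling this every so often during the traversal should reproduce a curve like the one in the question, and makes it easy to see whether a given hint changes anything.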

If you want to take a more active role in controlling this process, have a look at the madvise() system call.

This allows you to tune the paging behavior for your mmap, with advice values like:

MADV_FREE (since Linux 4.5)
- The application no longer requires the pages in the range specified by addr and len. The kernel can thus free these pages, but the freeing could be delayed until memory pressure occurs. ...
MADV_COLD (since Linux 5.4)
- Deactivate a given range of pages. This will make the pages a more probable reclaim target should there be memory pressure.
MADV_SEQUENTIAL
- Expect page references in sequential order. (Hence, pages in the given range can be aggressively read ahead, and may be freed soon after they are accessed.)
MADV_WILLNEED
- Expect access in the near future. (Hence, it might be a good idea to read some pages ahead.)
MADV_DONTNEED
- Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources associated with it.) ...

Issuing an madvise(MADV_SEQUENTIAL) after creating the mmap might be sufficient to get acceptable behavior. If not, you could also intersperse some MADV_WILLNEED / MADV_DONTNEED access hints (and/or MADV_FREE / MADV_COLD) during the traversal as you pass groups of pages. One caveat: MADV_FREE can be applied only to private anonymous pages, so it will not help with a shared file mapping like the one in the question.
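To make the traversal idea concrete, here is a minimal sketch of one way it could look. The function name `scan_with_window`, the byte-sum workload, and the window size are all assumptions for illustration, not something from the question. It maps the file shared and read-only, gives the MADV_SEQUENTIAL hint once, and issues MADV_DONTNEED on each window as soon as it has been read.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes madvise() and MADV_* under strict -std= modes */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sums every byte of `path`, releasing each
 * `window`-sized chunk of the mapping once it has been read.
 * Returns 0 and stores the sum in *out, or -1 on error. */
int scan_with_window(const char *path, size_t window, uint64_t *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return -1;

    /* madvise() needs a page-aligned start address, so round the window
     * up to a whole number of pages (and to at least one page). */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    window = (window + page - 1) / page * page;
    if (window == 0)
        window = page;

    /* The hints are advisory, so failures are deliberately ignored. */
    (void)madvise(base, len, MADV_SEQUENTIAL);

    uint64_t sum = 0;
    for (size_t off = 0; off < len; off += window) {
        size_t n = (len - off < window) ? len - off : window;

        for (size_t i = 0; i < n; i++)  /* stand-in for the real work */
            sum += base[off + i];

        /* Finished with this window: tell the kernel we no longer need it. */
        (void)madvise(base + off, n, MADV_DONTNEED);
    }

    munmap(base, len);
    *out = sum;
    return 0;
}
```

With a window of, say, 256 MB, the process's RSS for the mapping should stay near that size instead of growing until the kernel steps in. Keep in mind that MADV_DONTNEED on a shared file mapping only drops the pages from this process's address space; the data may well stay in the page cache until the kernel decides to reclaim it.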