Dealing with fragmentation in a memory pool?

Original post 2011-10-22 04:47:11 7 5 c++/ memory/ pool/ fragmentation

Suppose I have a memory pool object with a constructor that takes a pointer ptr to a large chunk of memory and its size N. If I do many random allocations and deallocations of various sizes, I can get the memory into a state where I cannot allocate an M-byte object contiguously, even though plenty of memory may be free in total! At the same time, I can't compact the memory, because that would leave the consumers holding dangling pointers. How does one resolve fragmentation in this case?

5 answers

I wanted to add my 2 cents only because no one else pointed out that, from your description, it sounds like you are implementing a standard heap allocator (i.e. what all of us already use every time we call malloc() or operator new).

A heap is exactly such an object: it goes to the virtual memory manager and asks for a large chunk of memory (what you call "a pool"). It then uses all kinds of different algorithms to allocate and free chunks of various sizes as efficiently as possible. Furthermore, many people have modified and optimized these algorithms over the years. For a long time, Windows came with an option called the low-fragmentation heap (LFH), which you used to have to enable manually. Starting with Vista, LFH is used for all heaps by default.

Heaps are not perfect, and they can definitely bog down performance when not used properly. Since OS vendors can't possibly anticipate every scenario in which you will use a heap, their heap managers have to be optimized for the "average" use. But if your requirements are similar to those of a regular heap (i.e. many objects, different sizes, ...), you should consider just using a heap and not reinventing it, because chances are your implementation will be inferior to what the OS already provides for you.

With memory allocation, the only way to gain performance over simply using the heap is to give up some other aspect (allocation overhead, allocation lifetime, ...) that is not important to your specific application.

For example, in our application we had a requirement for many allocations of less than 1KB, but these allocations were used only for very short periods of time (milliseconds). To optimize the app, I used the Boost Pool library but extended it so that my "allocator" actually contained a collection of boost pool objects, each responsible for allocating one specific size, from 16 bytes up to 1024 (in steps of 4). This provided almost free (O(1) complexity) allocation and deallocation of these objects, but the catch is that a) memory usage is always large and never goes down, even if we don't have a single object allocated, and b) Boost Pool never frees the memory it uses (at least in the mode we are using it in), so we only use this for objects which don't stick around very long.
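The "collection of fixed-size pools" idea can be sketched roughly as follows. This is not the author's code and does not use Boost Pool: `FixedPool` and `SizeClassAllocator` are hypothetical names, and the sketch assumes size classes in steps of 16 bytes rather than 4 so that every block stays suitably aligned. As in the answer, memory taken by a pool is never handed back until the pool is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Hypothetical fixed-size pool: carves large slabs into equal blocks and keeps
// free blocks on an intrusive singly linked list, so allocate/deallocate are O(1).
class FixedPool {
public:
    explicit FixedPool(std::size_t block_size, std::size_t blocks_per_slab = 64)
        : block_size_(block_size), blocks_per_slab_(blocks_per_slab), free_list_(nullptr) {
        // A free block must be able to hold the list node, correctly aligned.
        assert(block_size_ >= sizeof(Node) && block_size_ % alignof(Node) == 0);
    }

    void* allocate() {
        if (free_list_ == nullptr) grow();
        Node* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    void deallocate(void* p) {
        // Reuse the block's own storage as the list node and push it on the front.
        free_list_ = ::new (p) Node{free_list_};
    }

private:
    struct Node { Node* next; };

    void grow() {
        // Slabs are only released when the pool is destroyed (the trade-off above).
        slabs_.push_back(std::unique_ptr<unsigned char[]>(
            new unsigned char[block_size_ * blocks_per_slab_]));
        unsigned char* base = slabs_.back().get();
        for (std::size_t i = 0; i < blocks_per_slab_; ++i)
            deallocate(base + i * block_size_);
    }

    std::size_t block_size_;
    std::size_t blocks_per_slab_;
    Node* free_list_;
    std::vector<std::unique_ptr<unsigned char[]>> slabs_;
};

// Hypothetical front end: one FixedPool per size class, 16..1024 bytes.
class SizeClassAllocator {
public:
    static constexpr std::size_t kMin = 16, kMax = 1024, kStep = 16;

    SizeClassAllocator() {
        for (std::size_t s = kMin; s <= kMax; s += kStep) pools_.emplace_back(s);
    }

    // Index of the smallest size class that can hold n bytes.
    static std::size_t index_for(std::size_t n) {
        assert(n <= kMax);
        if (n < kMin) n = kMin;
        return (n - kMin + kStep - 1) / kStep;
    }

    void* allocate(std::size_t n) { return pools_[index_for(n)].allocate(); }
    void deallocate(void* p, std::size_t n) { pools_[index_for(n)].deallocate(p); }

private:
    std::vector<FixedPool> pools_;
};
```

Because every block in a given pool has the same size, a freed block can always satisfy the next request in that class, so this kind of pool does not fragment externally; the price is the unused tail of each block and the memory that stays reserved.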

So which aspect(s) of normal memory allocation are you willing to give up in your app?

Depending on the system, there are a couple of ways to do it.

Try to avoid fragmentation in the first place: if you allocate blocks in powers of 2, you have less chance of causing this kind of fragmentation. There are a couple of other ways around it, but if you ever reach this state then you are simply out of memory (OOM) at that point, because there are no delicate ways of handling it other than killing the process that asked for memory, blocking until you can allocate memory, or returning NULL from the allocation.
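As a small illustration of the power-of-2 idea, a pool could round every request up before searching for a block. The helper name `round_up_pow2` and the 16-byte minimum are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Round a request up to the next power of two, with an assumed 16-byte minimum.
// With only a handful of distinct block sizes in play, a freed block is much
// more likely to fit a later request exactly.
inline std::size_t round_up_pow2(std::size_t n) {
    // Largest power of two representable in size_t (only the top bit set).
    const std::size_t max_pow2 = ~(static_cast<std::size_t>(-1) >> 1);
    assert(n <= max_pow2);  // otherwise the loop below could not terminate
    std::size_t size = 16;
    while (size < n) size <<= 1;
    return size;
}
```

The cost is internal waste: a request just over a power of two (say 1025 bytes) occupies almost twice what it needs.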

Another way is to pass pointers to pointers to your data (e.g. int **). Then you can rearrange memory beneath the program (in a thread-safe way, I hope) and compact the allocations so that you can allocate new blocks and still keep the data from old blocks (once the system gets to this state, compaction is a heavy overhead, but it should seldom be needed).
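A minimal sketch of that indirection, under several simplifying assumptions: consumers hold an index-style handle instead of an `int **`, blocks are bump-allocated, handles are never reused, alignment is ignored, the stored data is trivially copyable, and there is no thread safety. `CompactingPool` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

class CompactingPool {
public:
    using Handle = std::size_t;
    static constexpr Handle kInvalid = static_cast<Handle>(-1);

    explicit CompactingPool(std::size_t capacity) : buffer_(capacity), top_(0) {}

    // Bump-allocate; if the tail is too small, compact once and retry.
    Handle allocate(std::size_t size) {
        if (size > buffer_.size() - top_) {
            compact();
            if (size > buffer_.size() - top_) return kInvalid;
        }
        entries_.push_back(Entry{top_, size, true});
        top_ += size;
        return entries_.size() - 1;
    }

    void release(Handle h) {
        assert(h < entries_.size() && entries_[h].live);
        entries_[h].live = false;
    }

    // The returned pointer is only valid until the next allocate() call,
    // because compaction may move the block.
    void* get(Handle h) {
        assert(h < entries_.size() && entries_[h].live);
        return buffer_.data() + entries_[h].offset;
    }

    // Slide every live block down over the holes left by released blocks.
    // Entries are stored in address order, so one forward pass is enough.
    void compact() {
        std::size_t dst = 0;
        for (Entry& e : entries_) {
            if (!e.live) continue;
            if (e.offset != dst)
                std::memmove(buffer_.data() + dst, buffer_.data() + e.offset, e.size);
            e.offset = dst;
            dst += e.size;
        }
        top_ = dst;
    }

private:
    struct Entry { std::size_t offset; std::size_t size; bool live; };

    std::vector<unsigned char> buffer_;
    std::size_t top_;
    std::vector<Entry> entries_;
};
```

For example, in a 64-byte pool holding blocks of 16, 32 and 16 bytes, releasing the two 16-byte blocks leaves 32 free bytes in two separate holes; a later 32-byte request then triggers compaction and succeeds, while the surviving block's handle still resolves to its data.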

There are also ways of "binning" memory so that allocations of similar size share contiguous pages: for instance, dedicate one page only to allocations of 512 bytes and less, another to 1024 bytes and less, etc. This makes it easier to decide which bin to use, and in the worst case you split a block from the next highest bin or merge blocks from a lower bin, which reduces the chance of fragmenting across multiple pages.

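Implementing an object pool for the objects you allocate frequently will greatly reduce fragmentation, without having to change your memory allocator.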
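One possible shape for such an object pool is sketched below; `ObjectPool`, `acquire`, `release` and `idle` are hypothetical names. It sits on top of the ordinary allocator and simply recycles objects instead of freeing them, so the heap sees far fewer allocate/free cycles. It assumes T is default-constructible and that the caller resets an object's state before reusing it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
class ObjectPool {
public:
    // Hand out a recycled object if one is idle, otherwise create a new one.
    std::unique_ptr<T> acquire() {
        if (free_.empty()) return std::unique_ptr<T>(new T());
        std::unique_ptr<T> obj = std::move(free_.back());
        free_.pop_back();
        return obj;
    }

    // Return an object to the pool instead of destroying it.
    void release(std::unique_ptr<T> obj) {
        assert(obj != nullptr);
        free_.push_back(std::move(obj));
    }

    std::size_t idle() const { return free_.size(); }

private:
    std::vector<std::unique_ptr<T>> free_;
};
```

Objects that are never released back are still cleaned up normally, since callers own them through `std::unique_ptr`.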

It would be helpful to know more exactly what you are actually trying to do, because there are many ways to deal with this.
But the first question is: is this actually happening, or is it a theoretical concern?

One thing to keep in mind is that you normally have a lot more virtual address space available than physical memory, so even when physical memory is fragmented, there is still plenty of contiguous virtual memory. (Of course, the physical memory is discontiguous underneath, but your code doesn't see that.)

I think there is sometimes unwarranted fear of memory fragmentation, and as a result people write a custom memory allocator (or worse, they concoct a scheme with handles and moveable memory and compaction). I think these are rarely needed in practice, and it can sometimes improve performance to throw this out and go back to using malloc.

Write the pool to operate as a list of allocations, which you can then extend and destroy as needed. This can reduce fragmentation.
And/or implement allocation transfer (or move) support so you can compact active allocations. The object/holder may need to assist you, since the pool may not necessarily know how to transfer types itself. If the pool is used with a collection type, then it is far easier to accomplish compacting/transfers.