
Efficient allocation of dynamic arrays within mmap'ed memory

I have a very large number of arrays (fixed at runtime, around 10 - 30 million). Each array has between 0 and 128 elements, each 6 bytes.

I need to store all the arrays in mmap'ed memory (so I can't use malloc), and the arrays need to be able to grow dynamically (up to 128 elements, and the arrays never shrink).

I implemented a naive approach: an int array representing the state of each 6-byte block in the mmap'ed memory. A value of 0xffffffff at an offset means the corresponding offset in the mmap'ed memory is free; any other value is the id of the array occupying it (the id is needed for defragmentation in my current implementation, since blocks can't be moved without knowing which array they belong to, in order to update other data structures). On allocation, and whenever an array outgrows its allocation, it simply iterates until it finds enough free blocks and inserts at the corresponding offset.

This is roughly what the allocation array and mmap'ed memory look like:

| 0xffffffff | 0xffffffff |    1234    |    1234    | 0xffffffff | ...
---------------------------------------------------------------------
|    free    |    free    |array1234[0]|array1234[1]|    free    | ...
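A minimal sketch of this naive first-fit approach, using a plain static buffer to stand in for the mmap'ed region (all sizes and names here are illustrative, not from the original post):

```c
#include <assert.h>
#include <stdint.h>

#define NBLOCKS    1024       /* 6-byte blocks in the mapped region */
#define FREE_BLOCK 0xffffffffu

/* Stand-ins for the mmap'ed region and the per-block state array. */
static uint8_t  region[NBLOCKS * 6];
static uint32_t state[NBLOCKS];   /* FREE_BLOCK or owning array id  */

static void init_state(void) {
    for (int i = 0; i < NBLOCKS; i++) state[i] = FREE_BLOCK;
}

/* First-fit scan: find n contiguous free blocks, claim them for `id`,
 * and return the first block's offset, or -1 if no run exists. This is
 * the O(region size) step the question is trying to get rid of. */
static long alloc_blocks(uint32_t id, int n) {
    int run = 0;
    for (int i = 0; i < NBLOCKS; i++) {
        run = (state[i] == FREE_BLOCK) ? run + 1 : 0;
        if (run == n) {
            int first = i - n + 1;
            for (int j = first; j <= i; j++) state[j] = id;
            return first;
        }
    }
    return -1;
}
```

Growing an array under this scheme is just another `alloc_blocks` call for the larger size, followed by copying the elements over and freeing the old run.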


This approach, though, has a memory overhead of (offset of the furthest used block in the mmap'ed memory) × 4 (4 bytes per int), i.e. 32 bits of bookkeeping for every 6-byte block.

What better approaches are there for this specific case?

My ideal requirements for this are:

  • Memory overhead (any allocation tables + unused space) <= 1.5 bits per element + 4*6 bytes per array
  • O(1) allocation and growing of arrays

Boost.Interprocess seems to have a neat implementation of managed memory-mapped files, with provisions similar to malloc/free but for mapped files (i.e. you have a handle to a suitably large memory-mapped file, and you can ask the library to sub-allocate an unused part of the file for something, like an array). From the documentation:

Boost.Interprocess offers some basic classes to create shared memory objects and file mappings and map those mappable classes to the process' address space.

However, managing those memory segments is not easy for non-trivial tasks. A mapped region is a fixed-length memory buffer, and creating and destroying objects of any type dynamically requires a lot of work, since it would require programming a memory management algorithm to allocate portions of that segment. Many times, we also want to associate names to objects created in shared memory, so all the processes can find the object using the name.

Boost.Interprocess offers 4 managed memory segment classes:

  • To manage a shared memory mapped region (basic_managed_shared_memory class).
  • To manage a memory mapped file (basic_managed_mapped_file).
  • To manage a heap allocated (operator new) memory buffer (basic_managed_heap_memory class).
  • To manage a user provided fixed size buffer (basic_managed_external_buffer class).

The most important services of a managed memory segment are:

  • Dynamic allocation of portions of the memory segment.
  • Construction of C++ objects in the memory segment. These objects can be anonymous or we can associate a name to them.
  • Searching capabilities for named objects.
  • Customization of many features: memory allocation algorithm, index types or character types.
  • Atomic constructions and destructions so that if the segment is shared between two processes it's impossible to create two objects associated with the same name, simplifying synchronization.

How many mmap'ed areas can you afford? If 128 is OK, then I'd create 128 areas corresponding to all the possible sizes of your arrays, and ideally a linked list of free entries for each area. In this case you get a fixed record size within each area. Growing an array from N to N + 1 elements is then a matter of moving the data from area[N] into area[N + 1]: at the end of that area if its linked list of empty entries is empty, or into an empty slot if not. For area[N], the vacated slot is added to its list of free entries.
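A sketch of this per-size-class scheme, here with a plain free-index stack per area; malloc() stands in for the mmap'ed areas, and all constants are illustrative, not from the original post:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAXLEN 8     /* 128 in the post; shrunk for the sketch */
#define SLOTS  64    /* slots per area                         */
#define ELEM   6     /* bytes per element                      */

typedef struct {
    uint8_t  *data;              /* SLOTS records of n*ELEM bytes each */
    uint32_t  free_stack[SLOTS]; /* indices of free slots              */
    int       n_free;
} Area;

static Area areas[MAXLEN + 1];   /* areas[n] holds arrays of n elements */

static void init_areas(void) {
    for (int n = 1; n <= MAXLEN; n++) {
        areas[n].data = malloc((size_t)SLOTS * n * ELEM);
        areas[n].n_free = SLOTS;
        for (int i = 0; i < SLOTS; i++)
            areas[n].free_stack[i] = (uint32_t)i;
    }
}

static uint8_t *record(int n, int slot) {
    return areas[n].data + (size_t)slot * n * ELEM;
}

/* O(1): pop a free slot in area[n]; -1 if the area is full. */
static int area_alloc(int n) {
    Area *a = &areas[n];
    return a->n_free ? (int)a->free_stack[--a->n_free] : -1;
}

/* O(1) grow: copy the record into area[n+1], recycle the old slot. */
static int grow(int n, int slot) {
    if (n >= MAXLEN) return -1;
    int dst = area_alloc(n + 1);
    if (dst < 0) return -1;
    memcpy(record(n + 1, dst), record(n, slot), (size_t)n * ELEM);
    areas[n].free_stack[areas[n].n_free++] = (uint32_t)slot;
    return dst;
}
```

Since every record in area[n] is exactly n × 6 bytes, there is no per-record padding; the only unused space is whatever sits on the free stacks.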

UPDATE: The linked list can be embedded in the main structures, so no extra allocation is needed: the first field (an int) inside every possible record (from size 1 to 128) can be an index to the next free entry. For allocated entries it is always void (0xffffffff), but while an entry is free this index becomes a member of the corresponding linked chain.
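The embedded (intrusive) free list from the update might look like this for a single pool of fixed-size records; one such list would exist per size class. The link is read and written with memcpy to avoid alignment issues, and the pool size and record size are illustrative:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NSLOTS 16
#define RECSZ  24            /* illustrative record size, >= 4 bytes */
#define NIL    0xffffffffu

static uint8_t  pool[NSLOTS][RECSZ];
static uint32_t head;        /* index of the first free slot, or NIL */

/* While a slot is free, its first 4 bytes store the next free index. */
static uint32_t get_next(uint32_t slot) {
    uint32_t v;
    memcpy(&v, pool[slot], sizeof v);
    return v;
}
static void set_next(uint32_t slot, uint32_t v) {
    memcpy(pool[slot], &v, sizeof v);
}

static void pool_init(void) {
    for (uint32_t i = 0; i < NSLOTS; i++)
        set_next(i, i + 1 < NSLOTS ? i + 1 : NIL);
    head = 0;
}

/* O(1) allocate: pop the head of the embedded list. Once allocated,
 * the slot's first 4 bytes become ordinary array data again. */
static long pool_alloc(void) {
    if (head == NIL) return -1;
    uint32_t slot = head;
    head = get_next(slot);
    return (long)slot;
}

/* O(1) free: push the slot back, relinking through its first field. */
static void pool_free(uint32_t slot) {
    set_next(slot, head);
    head = slot;
}
```

Because the links live inside otherwise-unused (free) records, the only fixed overhead left is one head index per size class.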

I devised, and ultimately went with, a memory allocation algorithm that just about lives up to my requirements: amortised O(1), very little fragmentation, and very little overhead. Feel free to comment, and I'll detail it when I get a chance.
