
Efficiently Allocate Memory in Kernel

I want to write a kernel module that receives TCP/IP packets at roughly 8 Mbps. I have to store these packets for 500 ms and then forward them in order, and this has to be done for 30 members. What is the best approach? Should I call kmalloc once, e.g. kmalloc(64000000, GFP_ATOMIC)? Calling kmalloc and kfree for every packet takes time and will hurt performance. And if I allocate that much memory in one shot, will the Linux kernel allow it?

I once wrote a kernel module that processed packets on a 10 Gbit/s link. I used vmalloc to allocate about 1 GB of contiguous (virtual) memory and placed a statically sized hash table in it for connection tracking (code).

If you know how much memory you need, I recommend pre-allocating it. This has two advantages:

  • It is fast (no allocating or freeing at runtime).
  • You don't need a fallback strategy for when kmalloc(_, GFP_ATOMIC) cannot return memory, which can actually happen quite often under heavy load.

Disadvantage:

  • You might allocate more memory than necessary.

So, for writing a special-purpose kernel module, please pre-allocate as much memory as you can get ;)

If you write a kernel module for commodity hardware used by many novice users, it would be nice to allocate memory on demand (and waste less memory).
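
To make "know how much memory you need" concrete, here is a rough sizing sketch in user-space C. It assumes the 8 Mbps in the question is per member, that Mbps means 10^6 bits per second, and that only payload bytes are counted (no per-packet metadata or headroom). buffer_bytes is a made-up helper name for illustration, not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed to hold `window_ms` milliseconds of traffic arriving at
 * `rate_bps` bits per second, for `members` independent streams.
 * Assumption: payload only, no per-packet bookkeeping.
 */
static uint64_t buffer_bytes(uint64_t rate_bps, uint64_t window_ms,
                             uint64_t members)
{
    assert(members > 0);

    /* Bits that arrive on one stream during the window. */
    uint64_t bits = rate_bps * window_ms / 1000;

    /* Round up to whole bytes, then scale by the number of streams. */
    return (bits + 7) / 8 * members;
}
```

With the question's figures (8,000,000 bit/s, 500 ms, 30 members) this gives 500,000 bytes per member and 15,000,000 bytes in total, before any safety margin.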


Where do you allocate the memory? GFP_ATOMIC can only return a very small amount of memory and should only be used if your memory allocation cannot sleep. You can use GFP_KERNEL when it is safe to sleep, e.g. when you are not in interrupt context. See this question for more. It is safe to use vmalloc during module initialization to pre-allocate all your memory.
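
A minimal sketch of that split, written as user-space C so it can be compiled and tried outside the kernel: all memory is obtained once in an init function, and the per-packet path only copies into slots that already exist. The names pkt_ring, pkt_slot and SLOT_PAYLOAD are hypothetical, the 1500-byte slot size is an assumption, and malloc/free stand in for what would be vmalloc/vfree in a module's init and exit functions. There is no locking and no 500 ms timer here; a real module would need both.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_PAYLOAD 1500          /* assumed maximum packet size in bytes */

struct pkt_slot {
    uint16_t len;                  /* number of valid bytes in data[] */
    uint8_t  data[SLOT_PAYLOAD];
};

struct pkt_ring {
    struct pkt_slot *slots;        /* one block, allocated once at init */
    size_t capacity;               /* number of slots */
    size_t head;                   /* index of the oldest stored packet */
    size_t count;                  /* number of stored packets */
};

/* Allocate every slot up front. In a module this is where sleeping is fine. */
static int pkt_ring_init(struct pkt_ring *r, size_t capacity)
{
    if (capacity == 0)
        return -1;
    r->slots = malloc(capacity * sizeof(*r->slots)); /* vmalloc() in a module */
    if (!r->slots)
        return -1;
    r->capacity = capacity;
    r->head = 0;
    r->count = 0;
    return 0;
}

static void pkt_ring_destroy(struct pkt_ring *r)
{
    free(r->slots);                /* vfree() in a module */
    r->slots = NULL;
}

/* Per-packet path: copy into the next free slot, no allocation at all.
 * Returns -1 if the packet is too large or the ring is full. */
static int pkt_ring_push(struct pkt_ring *r, const void *pkt, size_t len)
{
    if (len > SLOT_PAYLOAD || r->count == r->capacity)
        return -1;
    struct pkt_slot *s = &r->slots[(r->head + r->count) % r->capacity];
    s->len = (uint16_t)len;
    memcpy(s->data, pkt, len);
    r->count++;
    return 0;
}

/* Remove the oldest packet (FIFO order). The returned slot stays valid
 * only until a later push reuses it. Returns NULL if the ring is empty. */
static const struct pkt_slot *pkt_ring_pop(struct pkt_ring *r)
{
    if (r->count == 0)
        return NULL;
    assert(r->head < r->capacity);
    const struct pkt_slot *s = &r->slots[r->head];
    r->head = (r->head + 1) % r->capacity;
    r->count--;
    return s;
}
```

Because pkt_ring_push never allocates, the question of what to do when an atomic allocation fails does not come up in the packet path; the only failure mode left is a full ring, which the caller has to handle (for example by dropping the packet).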

Using vmalloc as in corny's answer will be faster with Linux kernel 5.2 (released Q3 2019) because of kernel changes.

From Michael Larabel:

The Linux kernel's vmalloc code has the potential of performing much faster on Linux 5.2, particularly with embedded devices.
Vmalloc is used for allocating contiguous memory in the virtual address space and saw a nice optimization merged today on the expected final day of the Linux 5.2 merge window.

As part of a pull (commit cb6f873) merged minutes ago from Andrew Morton are "large changes to vmalloc, yielding large performance benefits."

The principal change to the vmalloc code is keeping track of free blocks for allocation.
Currently an allocation of the new VA area is done over busy list iteration until a suitable hole is found between two busy areas. Therefore each new allocation causes the list being grown. Due to long list and different permissive parameters an allocation can take a long time on embedded devices (milliseconds).

This patch organizes the vmalloc memory layout into free areas of the VMALLOC_START - VMALLOC_END range. It uses a red-black tree that keeps blocks sorted by their offsets in pair with linked list keeping the free space in order of increasing addresses.

With this patch from Uladzislau Rezki, calling vmalloc() can take up to 67% less time compared to the behavior on Linux 5.1 and prior, at least with tests done by the developer under QEMU.

The commit, as mirrored on GitHub, is here:

It introduces a red-black tree:

/*
 * This augment red-black tree represents the free vmap space.
 * All vmap_area objects in this tree are sorted by va->va_start
 * address. It is used for allocation and merging when a vmap
 * object is released.
 *
 * Each vmap_area node contains a maximum available free block
 * of its sub-tree, right or left. Therefore it is possible to
 * find a lowest match of free area.
 */

With the function:

/*
 * Merge de-allocated chunk of VA memory with previous
 * and next free blocks. If coalesce is not done a new
 * free area is inserted. If VA has been merged, it is
 * freed.
 */
static __always_inline void
merge_or_add_vmap_area(struct vmap_area *va,
    struct rb_root *root, struct list_head *head)

/*
 * Find a place in the tree where VA potentially will be
 * inserted, unless it is merged with its sibling/siblings.
 */

/*
 * Get next node of VA to check if merging can be done.
 */

/*
 * start            end
 * |                |
 * |<------VA------>|<-----Next----->|
 *                  |                |
 *                  start            end
 */
...
/*
 * start            end
 * |                |
 * |<-----Prev----->|<------VA------>|
 *                  |                |
 *                  start            end
 */
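
The two diagrams above show the merge cases: a released range that touches the next free block, and one that touches the previous free block. The idea can be sketched in user-space C as below. This is not the kernel's code: it uses a small sorted array instead of the augmented red-black tree and linked list, the search is a linear scan, and the names free_list, free_list_release and free_list_alloc are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FREE 64

struct range { unsigned long start, end; };   /* half-open: [start, end) */

struct free_list {
    struct range r[MAX_FREE];   /* sorted by start, never overlapping */
    size_t n;
};

/* Give [start, end) back, merging with the previous and/or next free
 * block when they are adjacent. Returns -1 if the array is full. */
static int free_list_release(struct free_list *fl,
                             unsigned long start, unsigned long end)
{
    size_t i = 0;

    assert(start < end);

    /* Find the first free block that starts at or after the new range. */
    while (i < fl->n && fl->r[i].start < start)
        i++;

    int merge_prev = (i > 0 && fl->r[i - 1].end == start);
    int merge_next = (i < fl->n && fl->r[i].start == end);

    if (merge_prev && merge_next) {
        /* |<--Prev-->|<--VA-->|<--Next-->| becomes one block. */
        fl->r[i - 1].end = fl->r[i].end;
        memmove(&fl->r[i], &fl->r[i + 1],
                (fl->n - i - 1) * sizeof(fl->r[0]));
        fl->n--;
    } else if (merge_prev) {
        fl->r[i - 1].end = end;          /* grow Prev to the right */
    } else if (merge_next) {
        fl->r[i].start = start;          /* grow Next to the left */
    } else {
        if (fl->n == MAX_FREE)
            return -1;
        memmove(&fl->r[i + 1], &fl->r[i],
                (fl->n - i) * sizeof(fl->r[0]));
        fl->r[i].start = start;          /* no neighbour: insert new block */
        fl->r[i].end = end;
        fl->n++;
    }
    return 0;
}

/* Carve `size` bytes from the lowest-addressed free block that fits. */
static int free_list_alloc(struct free_list *fl, unsigned long size,
                           unsigned long *out)
{
    for (size_t i = 0; i < fl->n; i++) {
        if (fl->r[i].end - fl->r[i].start < size)
            continue;
        *out = fl->r[i].start;
        fl->r[i].start += size;
        if (fl->r[i].start == fl->r[i].end) {   /* block fully consumed */
            memmove(&fl->r[i], &fl->r[i + 1],
                    (fl->n - i - 1) * sizeof(fl->r[0]));
            fl->n--;
        }
        return 0;
    }
    return -1;
}
```

Keeping the free blocks sorted by address is what makes the neighbour check cheap: only the block just before and just after the insertion point can be adjacent. The linear scan in free_list_alloc is the part the kernel patch avoids, since each tree node there records the largest free block in its subtree.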
