
Use of malloc in Real Time application

We had an issue with one of our real-time applications. The idea was to run one of the threads every 2 ms (500 Hz). After the application ran for half an hour or so, we noticed that the thread was falling behind.

After a few discussions, people suspect that the malloc allocations in the real-time thread are the root cause.

I am wondering: is it always a good idea to avoid all dynamic memory allocation in real-time threads?

The Internet has very few resources on this. If you can point to some discussion, that would be great too.

Thanks

The first step is to profile the code and make sure you understand exactly where the bottleneck is. People are often bad at guessing bottlenecks in code, and you might be surprised by the findings. You can simply instrument several parts of this routine yourself and dump min/avg/max durations at regular intervals (a minimal sketch follows below). You want to see the worst case (max), and whether the average duration increases as time goes by.
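A minimal sketch of that kind of instrumentation, assuming a POSIX/Linux target with clock_gettime(CLOCK_MONOTONIC); cycle_work and the reporting interval are placeholders for your own 2 ms routine:

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Placeholder for the body of your 2 ms task. */
void cycle_work(void);

static inline int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Call this instead of cycle_work() from the 500 Hz loop. */
void cycle_instrumented(void)
{
    static int64_t min_ns = INT64_MAX, max_ns = 0, sum_ns = 0;
    static long count = 0;

    int64_t t0 = now_ns();
    cycle_work();
    int64_t dt = now_ns() - t0;

    if (dt < min_ns) min_ns = dt;
    if (dt > max_ns) max_ns = dt;
    sum_ns += dt;

    if (++count == 5000) {   /* every ~10 s at 500 Hz */
        /* Note: fprintf can itself block; in a hard real-time loop you would
           hand the stats to a non-realtime thread instead of printing here. */
        fprintf(stderr, "cycle ns: min=%lld avg=%lld max=%lld\n",
                (long long)min_ns, (long long)(sum_ns / count),
                (long long)max_ns);
        min_ns = INT64_MAX;
        max_ns = 0;
        sum_ns = 0;
        count = 0;
    }
}
```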

I doubt that malloc will take any significant portion of those 2 ms on a reasonable microcontroller capable of running Linux; I'd say you are more likely to run out of memory due to fragmentation than to have performance issues. If you have any other syscalls in your function, they will easily take an order of magnitude longer than malloc.

But if malloc is really the problem, depending on how short-lived your objects are, how much memory you can afford to waste, and how much your requirements are known in advance, there are several approaches you can take:

  1. General purpose allocation (malloc from your standard library, or any third-party implementation): best approach if you have "more than enough" RAM, many short-lived objects, and no strict latency requirements

    • PROS: works for any object size out of the box, familiar interface, memory is shared dynamically, no need to "plan ahead" if memory is not an issue
    • CONS: slight performance penalty during allocation and/or deallocation, memory fragmentation issues when doing lots of allocations/deallocations of objects of different sizes, whether a run-time allocation will fail is less deterministic and cannot be easily mitigated at runtime
  2. Memory pool: best approach in most cases where memory is limited, low latency is required, and the object needs to live longer than a single block scope (a minimal sketch follows after this list)

    • PROS: allocation/deallocation time is guaranteed to be O(1) in any reasonable implementation, does not suffer from fragmentation, easier to plan its size in advance, failure to allocate at run-time is (likely) easier to mitigate
    • CONS: works for a single specific object size - memory is not shared with other parts of the program, and you must plan the pool size in advance or risk wasting memory
  3. Stack-based (automatic-duration) objects: best for smaller, short-lived objects (single block scope)

    • PROS: allocation and deallocation are done automatically, RAM is used only for the object's lifetime, and there are tools which can sometimes do a static analysis of your code and estimate the required stack size
    • CONS: objects limited to a single block scope - cannot share objects between interrupt invocations
  4. Individual statically allocated objects: best approach for long-lived objects

    • PROS: no allocation whatsoever - all needed objects exist throughout the application life-cycle, no problems with allocation/deallocation
    • CONS: wastes memory if the objects should be short-lived
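As a minimal, single-threaded sketch of option 2, here is a fixed-size memory pool built on a static array and a free list; the block size and count are assumptions, and a real-time version would also need protection against concurrent access (e.g. a lock-free free list or interrupt masking):

```c
#include <stddef.h>

/* All blocks come from a static array, so there is no heap use and no
   fragmentation; allocation and free are O(1) pointer operations. */
#define POOL_BLOCK_SIZE   64     /* must be >= sizeof(your object) */
#define POOL_BLOCK_COUNT  128

typedef union pool_block {
    union pool_block *next;               /* used while the block is free */
    unsigned char data[POOL_BLOCK_SIZE];  /* used while it is allocated   */
} pool_block;

static pool_block pool_storage[POOL_BLOCK_COUNT];
static pool_block *pool_free_list;

void pool_init(void)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT - 1; i++)
        pool_storage[i].next = &pool_storage[i + 1];
    pool_storage[POOL_BLOCK_COUNT - 1].next = NULL;
    pool_free_list = &pool_storage[0];
}

void *pool_alloc(void)
{
    pool_block *b = pool_free_list;
    if (b)                               /* NULL means the pool is exhausted */
        pool_free_list = b->next;
    return b;
}

void pool_free(void *p)
{
    pool_block *b = p;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

Because every block has the same size, allocation failure is simply "the pool is empty", which is easy to detect and mitigate at run time (e.g. skip the work for one cycle and raise a fault flag).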

Even if you decide to go for memory pools all over the program, make sure you add profiling/instrumentation to your code. And then leave it there forever to see how the performance changes over time.

As realtime software engineers in the aerospace industry, we see this question a lot. Even among our own engineers, we see software engineers attempt to use non-realtime programming techniques they learned elsewhere, or to use open-source code in their programs. Never allocate from the heap during realtime execution. One of our engineers created a tool that intercepts malloc and records the overhead. You can see in the numbers that you cannot predict when an allocation attempt will take a long time. Even on very high-end computers (72-core, 256 GB RAM servers) running a realtime hybrid of Linux, we record mallocs taking hundreds of milliseconds. The allocation path can cross into the kernel via a system call, which carries high overhead, and you don't know when you will get hit by the allocator's internal housekeeping (its "garbage collection"), or when it decides it must request another large chunk of memory for the task from the OS.
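For illustration only, one common way to build that kind of interception on Linux is an LD_PRELOAD shim around malloc; this is a hedged sketch of the general technique, not the actual tool mentioned above:

```c
/* mallochook.c
 * build: gcc -shared -fPIC -o mallochook.so mallochook.c -ldl
 * run:   LD_PRELOAD=./mallochook.so ./your_app
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

static void *(*real_malloc)(size_t) = NULL;
static __thread int in_hook = 0;   /* guards against recursion via fprintf */

void *malloc(size_t size)
{
    if (!real_malloc)
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    void *p = real_malloc(size);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    long ns = (t1.tv_sec - t0.tv_sec) * 1000000000L
            + (t1.tv_nsec - t0.tv_nsec);

    /* Only report unusually slow allocations (> 1 ms here), and only when
       not already inside the hook, since fprintf may itself call malloc. */
    if (ns > 1000000L && !in_hook) {
        in_hook = 1;
        fprintf(stderr, "slow malloc(%zu) took %ld ns\n", size, ns);
        in_hook = 0;
    }
    return p;
}
```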
