
Which data structure works best for fast lookup in shared memory?

I am still at the conceptual stage of a project and have yet to start implementing any code. One subtask is this:

Two processes will request data from a commonly accessed DLL. The DLL would store this data in a buffer in memory. If I just instantiate a structure within the DLL and store the data in it, each process will get its own separate copy of the structure and the data won't be shared. So I need a shared memory implementation. Another requirement is fast lookup within the data. I am not sure how an AVL tree can be stored in a shared memory space. Is there an implementation available on the internet of an AVL tree or hash map that can be stored in shared memory? Also, is this the right approach to the problem, or should I be using something else altogether?

TIA!

Whether this is the right approach depends on various factors, such as how expensive the data is to produce, whether the processes need to communicate with each other concerning the data, and so on. The rest of this answer assumes that you really do need a lookup structure in shared memory.

You can use any data structure, provided that you can allocate storage for both your data and the data structure's internals in your shared memory space. This typically means that you won't be able to use malloc for it, since each process' heap usually remains private. You will need your own custom allocator.
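To make the custom allocator idea concrete, here is a minimal sketch of a bump allocator that hands out blocks from one fixed region. The names shm_arena, shm_arena_init and shm_arena_alloc are made up for illustration. The sketch assumes the region is already mapped and its base is at least as aligned as any alignment you request (mapped regions are normally page-aligned); it has no free operation and no inter-process locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header stored at the very start of the shared region. */
typedef struct {
    size_t capacity; /* total bytes in the region, header included */
    size_t used;     /* bytes consumed so far, header included */
} shm_arena;

/* Set up an arena over `region`, which stands in for a mapped
 * shared memory block of `size` bytes. */
static shm_arena *shm_arena_init(void *region, size_t size) {
    assert(region != NULL && size >= sizeof(shm_arena));
    shm_arena *a = (shm_arena *)region;
    a->capacity = size;
    a->used = sizeof(shm_arena);
    return a;
}

/* Hand out `n` bytes whose offset from the region base is a multiple
 * of `align` (a power of two). Returns NULL when the region is full.
 * Assumption: callers serialise access themselves (e.g. with a mutex). */
static void *shm_arena_alloc(shm_arena *a, size_t n, size_t align) {
    assert(a != NULL);
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->capacity || n > a->capacity - start) {
        return NULL;
    }
    a->used = start + n;
    return (unsigned char *)a + start;
}
```

Because the bookkeeping lives inside the region itself, every process that maps the region sees the same allocator state. A real allocator would probably also need freeing and a lock around shm_arena_alloc.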

Let's say you chose AVL trees. Here's a library that implements them: https://github.com/fbuihuu/libtree. It looks like in this library, the "internal" AVL node data is stored intrusively in your "objects." Intrusive means that you reserve fields to be used by the library when declaring your object struct. So, as long as you allocate space for your objects in shared memory using your custom allocator, and also allocate space for the root tree struct there, the whole tree should be accessible to multiple processes. You just have to make sure that the shared memory is mapped at the same address range in each process, because the nodes link to each other with ordinary pointers.
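To illustrate what "intrusive" means, here is a small sketch. It is not libtree's API: the names tree_node, record, RECORD_OF, record_insert and record_find are hypothetical, and it uses a plain unbalanced binary search tree instead of an AVL tree to stay short. The point is only the layout: the link fields live inside the user's object, so the tree code never allocates anything itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link fields owned by the tree code. */
struct tree_node {
    struct tree_node *left;
    struct tree_node *right;
};

/* The user's object, with the link fields embedded in it. */
struct record {
    int key;
    int value;
    struct tree_node node;
};

/* Recover the enclosing record from a pointer to its embedded node. */
#define RECORD_OF(n) \
    ((struct record *)((char *)(n) - offsetof(struct record, node)))

/* Insert `r` into the tree. Returns NULL on success, or the record
 * already stored under the same key. No rebalancing is done here. */
static struct record *record_insert(struct tree_node **root, struct record *r) {
    assert(root != NULL && r != NULL);
    struct tree_node **link = root;
    while (*link != NULL) {
        struct record *cur = RECORD_OF(*link);
        if (r->key == cur->key) {
            return cur;
        }
        link = (r->key < cur->key) ? &(*link)->left : &(*link)->right;
    }
    r->node.left = NULL;
    r->node.right = NULL;
    *link = &r->node;
    return NULL;
}

/* Look up `key`; returns NULL when it is not present. */
static struct record *record_find(struct tree_node *root, int key) {
    while (root != NULL) {
        struct record *cur = RECORD_OF(root);
        if (key == cur->key) {
            return cur;
        }
        root = (key < cur->key) ? root->left : root->right;
    }
    return NULL;
}
```

If every struct record and the root pointer are placed in the shared region, the whole tree lives there. The raw pointers in tree_node are the reason the region must sit at the same address in each process.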

If you used a non-intrusive AVL implementation, meaning that each node is represented by an internal struct which then points to a separate struct containing your data, the library or your implementation would have to allow you to specify the allocator for the internal struct somehow, so that you could make sure the space will be allocated in shared memory.

As for how to write the custom allocator, that really depends on your usage and the system. You need to consider whether you will ever need to "resize" the shared memory region and whether the system allows that; whether you will allocate only fixed-size blocks inside the region or need to support blocks of arbitrary length; whether it's acceptable to spread your data structures over multiple shared memory regions; how your processes can synchronize and communicate; and so on. If you go this route, you should ask a new question on the topic. Be sure to mention what system you are using (Windows?) and what your constraints are.

EDIT

Just to further discourage you from doing this unless it's necessary: if, for example, your data is expensive to produce but you don't care whether the processes build up their own independent lookup structures once the data is available to them, then you can, for example, have the DLL write the data to a simple ring buffer in shared memory, and the rest of the code take it from there. Building up two AVL trees isn't really a problem unless they are going to be very large.

Also, if you only care about concurrency, and it's not important for there to be two processes, you may be able to make them both threads of one process.

In the case of Windows, the functions Microsoft recommends for shared memory (such as MapViewOfFile) can return a different pointer value in each process. This means that within the shared memory, offsets (from the start of the shared memory) have to be used instead of pointers. For example, a linked list stores a next offset instead of a next pointer. You may want to create macros to convert offsets to pointers and pointers to offsets.
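Such macros might look like the sketch below. The names shm_off, OFF_TO_PTR, PTR_TO_OFF, shm_item, list_push and list_sum are hypothetical. The sketch assumes that offset 0 is reserved as the "null" value (for instance because a header occupies the start of the region) and that the region is smaller than 4 GiB so a 32-bit offset suffices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t shm_off;        /* byte offset from the region base */
#define SHM_NULL ((shm_off)0)    /* offset 0 is reserved to mean "null" */

/* Convert between offsets and pointers, given this process's base address. */
#define OFF_TO_PTR(base, off) \
    ((off) == SHM_NULL ? NULL : (void *)((unsigned char *)(base) + (off)))
#define PTR_TO_OFF(base, ptr) \
    ((ptr) == NULL ? SHM_NULL \
                   : (shm_off)((unsigned char *)(ptr) - (unsigned char *)(base)))

/* Linked-list item that stores a next offset instead of a next pointer. */
struct shm_item {
    int value;
    shm_off next;
};

/* Push the item located at `item_off` onto the list headed by `*head`. */
static void list_push(void *base, shm_off *head, shm_off item_off) {
    struct shm_item *it = (struct shm_item *)OFF_TO_PTR(base, item_off);
    assert(head != NULL && it != NULL);
    it->next = *head;
    *head = item_off;
}

/* Walk the list from `head` and add up the values. */
static long list_sum(const void *base, shm_off head) {
    long sum = 0;
    shm_off off = head;
    while (off != SHM_NULL) {
        const struct shm_item *it =
            (const struct shm_item *)OFF_TO_PTR(base, off);
        sum += it->value;
        off = it->next;
    }
    return sum;
}
```

Each process passes its own base address, so the same stored offsets resolve correctly wherever the region happens to be mapped. The same substitution works for the left and right links of a tree node.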
