
How to avoid “heap pointer spaghetti” in dynamic graphs?

The generic problem

Suppose you are coding a system that consists of a graph, plus graph rewrite rules that can be activated depending on the configuration of neighboring nodes. That is, you have a dynamic graph that grows/shrinks unpredictably at runtime. If you naively use malloc, new nodes will be allocated at arbitrary positions in memory; after enough time, your heap turns into pointer spaghetti, giving you terrible cache efficiency. Is there any lightweight, incremental technique to make nodes that are wired together stay close together in memory?

What I tried

The only thing I could think of is embedding the nodes in a Cartesian space and running a physical elastic simulation that repels/attracts nodes. That would keep wired nodes together, but it seems silly, and I suspect the overhead of the simulation would outweigh the cache-efficiency speedup.

The solid example

This is the system I'm trying to implement. This is a brief snippet of the code I'm trying to optimize in C. This repo is a working prototype implementation in JS, with terrible cache efficiency (on top of the inefficiency of the language itself). This video shows the system in action graphically.

What you are looking to solve is the Linear Arrangement Problem. Finding an optimal arrangement is NP-hard, but good approximations exist. Here is a paper which should be a good place to start.
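To make the objective concrete, here is a small sketch (in C, with illustrative types and names not taken from the original code) of the quantity a linear arrangement tries to minimize: assign each node a position, then sum the distance |pos(u) − pos(v)| over every edge. Keeping wired nodes close together in memory is essentially keeping this sum small.

```c
#include <stddef.h>

typedef struct { size_t u, v; } Edge;   /* an undirected edge between node ids u and v */

/* Total linear-arrangement cost: sum of |pos[u] - pos[v]| over all edges,
   where pos[] maps each node id to its position in the layout. */
static size_t arrangement_cost(const size_t *pos, const Edge *edges, size_t nedges) {
    size_t cost = 0;
    for (size_t i = 0; i < nedges; i++) {
        size_t a = pos[edges[i].u], b = pos[edges[i].v];
        cost += (a > b) ? a - b : b - a;
    }
    return cost;
}
```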

You might look at this in terms of halfspace garbage collection. This isn't hard to implement (I've done it for an interpreter), particularly since you're only doing it for fixed-size node structures. Allocate from one large block of memory (called a halfspace). When it gets too full or fragmented, stop and copy everything to the other (which you can also make bigger). The trick is updating all the pointers. For this there is a very elegant and efficient algorithm called scan copy. There's a nice discussion of it at Cornell. It essentially traverses the graph breadth-first, copying as it goes, without any extra space beyond what is being copied. A nice property of the algorithm is that breadth-first levels end up adjacent after each copy. If that is a good enough level of locality, you'll get it very efficiently with this method.
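For illustration, here is a minimal, hedged sketch of the scan-copy step for fixed-size nodes (Cheney-style copying). The Node/Heap layout, the number of ports, and the forward field are assumptions made for this example, not the poster's code; a real collector would also swap the two halfspaces and recycle the old one, and would grow the destination space as needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <stddef.h>

#define PORTS 3

typedef struct Node {
    struct Node *forward;        /* forwarding pointer; must be NULL outside a copy */
    struct Node *port[PORTS];    /* edges to neighbouring nodes (may be NULL) */
    int          tag;            /* example payload */
} Node;

typedef struct Heap {
    Node  *space;                /* destination halfspace */
    size_t used, capacity;
} Heap;

/* Copy one node into the destination space, or return its existing copy. */
static Node *forward_node(Heap *to, Node *n) {
    if (n == NULL) return NULL;
    if (n->forward != NULL) return n->forward;   /* already copied */
    assert(to->used < to->capacity);             /* destination must be big enough */
    Node *copy = &to->space[to->used++];         /* bump allocation */
    *copy = *n;
    copy->forward = NULL;
    n->forward = copy;                           /* leave a forwarding pointer behind */
    return copy;
}

/* Copy everything reachable from `roots` into a fresh halfspace.
   Nodes end up laid out in breadth-first order; the region between the
   scan index and `used` acts as the BFS queue, so no extra space is needed. */
static Node *scan_copy(Node **roots, size_t nroots, size_t capacity) {
    Heap to = { malloc(capacity * sizeof(Node)), 0, capacity };
    if (!to.space) return NULL;

    for (size_t i = 0; i < nroots; i++)
        roots[i] = forward_node(&to, roots[i]);

    for (size_t scan = 0; scan < to.used; scan++)
        for (int p = 0; p < PORTS; p++)
            to.space[scan].port[p] = forward_node(&to, to.space[scan].port[p]);

    return to.space;   /* the old halfspace can now be freed or reused */
}
```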

If you're really concerned about the layout of memory, it might be worthwhile to manage it yourself.

You can malloc a large block of memory at startup, then allocate space out of that block. You'll need a separate structure to keep track of what has and hasn't been allocated. If you know that all allocated structures are of a fixed size, that simplifies free-space management (e.g. an array of indexes); otherwise you could use a linked list of pointers threaded through the free space. Given that you'll likely be allocating structs one at a time, you probably don't need to worry about tracking the smallest and/or largest contiguous block of free space.

One thing you'll need to be careful of is alignment. Again, if you'll always be allocating memory in multiples of the size of a single struct, that makes things easier; otherwise it's probably a good idea to ensure that all allocations start on a 4-byte boundary, i.e. the difference between the address you allocate and the start address received from malloc is a multiple of 4.

You can pass additional parameters to your custom allocation functions to give hints about where the block should be placed, such as the address of one or more nearby nodes.
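Putting the last three paragraphs together, here is a minimal sketch of such a self-managed pool: one large malloc'd block carved into fixed-size slots, a bitmap recording which slots are in use, and an allocation function that accepts a nearby node as a placement hint. All names (Pool, Node, WINDOW) and the fixed window heuristic are illustrative assumptions, not a definitive implementation.

```c
#include <stdlib.h>
#include <stddef.h>

#define WINDOW 64   /* how many slots around the hint to try before falling back */

typedef struct Node {
    int          kind;
    struct Node *port[3];
} Node;

typedef struct Pool {
    Node          *slots;    /* one large block; slot i starts a multiple of
                                sizeof(Node) from the base, so alignment is preserved */
    unsigned char *used;     /* used[i] != 0 when slot i is allocated */
    size_t         capacity;
} Pool;

static int pool_init(Pool *p, size_t capacity) {
    p->slots    = malloc(capacity * sizeof(Node));
    p->used     = calloc(capacity, 1);
    p->capacity = capacity;
    return (p->slots && p->used) ? 0 : -1;
}

static Node *pool_take(Pool *p, size_t i) {
    p->used[i] = 1;
    return &p->slots[i];
}

/* Allocate a node, preferring slots close to `hint` (may be NULL). */
static Node *pool_alloc_near(Pool *p, const Node *hint) {
    if (hint) {
        size_t h  = (size_t)(hint - p->slots);   /* slot index of the hint node */
        size_t lo = (h > WINDOW) ? h - WINDOW : 0;
        size_t hi = (h + WINDOW < p->capacity) ? h + WINDOW : p->capacity;
        for (size_t i = lo; i < hi; i++)
            if (!p->used[i]) return pool_take(p, i);
    }
    for (size_t i = 0; i < p->capacity; i++)     /* fall back: first free slot */
        if (!p->used[i]) return pool_take(p, i);
    return NULL;                                 /* pool exhausted */
}

static void pool_free(Pool *p, Node *n) {
    p->used[n - p->slots] = 0;
}
```

Because only whole slots are handed out, every allocation is automatically aligned for the struct, and the bitmap plays the role of the separate allocated/free bookkeeping structure described above.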

This can be viewed as a graph partitioning problem, where you're trying to cluster linked graph nodes on the same memory block. METIS is a good graph partitioning algorithm, but it's probably not appropriate for your use case because it requires global operations across the entire graph. However, two distributed graph partitioning algorithms that could be adapted to your use case are DIDIC and Ja-Be-Ja: the former attempts to minimize the number of edges that cross partitions without regard to partition size, while the latter attempts to create equally sized partitions. Both algorithms only require local knowledge of the graph to cluster each node, so if you've got any spare cycles you can use them to incrementally rebalance the graph. Fennel is a related algorithm that operates on streaming graphs, so, for example, you could use Fennel or a similar algorithm when initially allocating a graph node, and then use DIDIC/Ja-Be-Ja when rebalancing the graph.
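As a rough illustration of the streaming idea (a greatly simplified greedy heuristic in the spirit of Fennel, not the published algorithm), a newly created node can be placed in whichever memory block already holds the most of its neighbours, penalized by how full that block is. All names and the penalty form are illustrative.

```c
#include <stddef.h>

#define NBLOCKS 16

typedef struct {
    size_t load[NBLOCKS];   /* nodes currently stored in each block */
    size_t capacity;        /* slots per block */
} BlockStats;

/* neighbour_block[i] is the block holding the i-th existing neighbour of the
   node being placed (use -1 for neighbours that don't exist yet). Returns the
   chosen block; callers must handle the case where every block is full. */
static int choose_block(const BlockStats *s, const int *neighbour_block, size_t n) {
    int    best = 0;
    double best_score = -1e30;
    for (int b = 0; b < NBLOCKS; b++) {
        if (s->load[b] >= s->capacity) continue;   /* block is full */
        double neighbours_here = 0;
        for (size_t i = 0; i < n; i++)
            if (neighbour_block[i] == b) neighbours_here += 1;
        /* Reward co-location, penalise fullness (a crude stand-in for
           Fennel's load term). */
        double score = neighbours_here - (double)s->load[b] / (double)s->capacity;
        if (score > best_score) { best_score = score; best = b; }
    }
    return best;
}
```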
