

What is a good design pattern/structure to represent a Directed Acyclic Graph in Java?

I need to store a reasonably large Directed Acyclic Graph in Java (on the order of 100,000 nodes, depth between 7 and 20, irregularly shaped, average depth 13).

Which data structure(s) would perform best for storing it, given that the predominant operations I need after building the data structure are:

  • 99% of operations: find the full set of ascendant paths (from the root down to a given node)
  • 1% of operations: find all children or, more often, all ancestors of a given node.
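For the rarer "all ancestors" operation, a minimal sketch is a breadth-first walk over parent links. The class name `Ancestors` and the `parentsPerChild` map are assumptions made for illustration; they are not part of the question.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: collects all ancestors of a node in a DAG. */
final class Ancestors {
    /**
     * Returns every node reachable by following parent links upward from
     * start (start itself is excluded). parentsPerChild maps a node to its
     * direct parents; nodes without an entry are treated as roots.
     */
    static Set<Integer> of(int start, Map<Integer, List<Integer>> parentsPerChild) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            int node = queue.poll();
            for (int parent : parentsPerChild.getOrDefault(node, List.of())) {
                // seen.add returns false for a parent that was already visited,
                // so shared ancestors are expanded only once.
                if (seen.add(parent)) {
                    queue.add(parent);
                }
            }
        }
        return seen;
    }
}
```

Each ancestor is visited once, so the cost is proportional to the number of ancestors plus the parent links among them.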

Obviously, I'd like the first operation to be O(1) if possible, as opposed to O(average depth).

Please note that for the purposes of this question, the data structure is write-once: after I build it from a list of nodes and edges, the graph topology will never change.


My naive implementation would be to store it as a combination of:

HashMap<Integer, Integer[]> childrenPerParent;
HashMap<Integer, Integer[]> ascendantPaths; 

That is, for each node I store a list of that node's children and, separately, the set of paths to the root from that node.
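A minimal sketch of this naive layout, assuming the graph arrives as `{parent, child}` pairs and is acyclic. The class name `NaiveDag`, the extra `parentsPerChild` map, and the use of `List<int[]>` to hold several paths per node are illustrative assumptions; the two fields above only hint at the intended shape.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: precomputes every root-to-node path of a write-once DAG. */
final class NaiveDag {
    private final Map<Integer, List<Integer>> childrenPerParent = new HashMap<>();
    private final Map<Integer, List<Integer>> parentsPerChild = new HashMap<>();
    private final Map<Integer, List<int[]>> ascendantPaths = new HashMap<>();

    /** Each edge is {parent, child}; the input is assumed to contain no cycles. */
    NaiveDag(int[][] edges) {
        Set<Integer> nodes = new LinkedHashSet<>();
        for (int[] e : edges) {
            childrenPerParent.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            parentsPerChild.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
            nodes.add(e[0]);
            nodes.add(e[1]);
        }
        for (int node : nodes) {
            computePaths(node);
        }
    }

    /** Memoized: paths(node) = each path of each parent, extended by node. */
    private List<int[]> computePaths(int node) {
        List<int[]> cached = ascendantPaths.get(node);
        if (cached != null) {
            return cached;
        }
        List<int[]> result = new ArrayList<>();
        List<Integer> parents = parentsPerChild.get(node);
        if (parents == null) {
            result.add(new int[] {node}); // a root: the only path is the node itself
        } else {
            for (int parent : parents) {
                for (int[] prefix : computePaths(parent)) {
                    int[] path = Arrays.copyOf(prefix, prefix.length + 1);
                    path[prefix.length] = node;
                    result.add(path);
                }
            }
        }
        ascendantPaths.put(node, result);
        return result;
    }

    /** Single hash lookup; every path is listed root first. */
    List<int[]> pathsFromRoot(int node) {
        return ascendantPaths.getOrDefault(node, List.of());
    }

    List<Integer> children(int node) {
        return childrenPerParent.getOrDefault(node, List.of());
    }
}
```

For the diamond `1→2, 1→3, 2→4, 3→4`, `pathsFromRoot(4)` holds two paths, one through node 2 and one through node 3. Because a node with several parents inherits every path of every parent, the stored path count can grow much faster than the node count.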

Downside: this seems very wasteful in terms of space. We basically store each of the inner graph nodes many times over in ascendantPaths. For example, given the size estimates above, we would store an extra 100,000 * 13 = 1.3 million node copies in ascendantPaths, each of which is an object to be created and stored.
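One way to trade some lookup time for space is to keep only parent links in primitive `int` arrays and rebuild paths on demand. The sketch below is an illustration under stated assumptions, not a benchmarked answer: node ids are assumed to be dense integers `0..nodeCount-1`, edges are `{parent, child}` pairs, and the name `CompactDag` is hypothetical. A query costs time proportional to the total length of the returned paths rather than O(1).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: parent links packed into two int arrays (CSR-style layout). */
final class CompactDag {
    private final int[] parentStart; // parents of node i live at parentIds[parentStart[i] .. parentStart[i + 1])
    private final int[] parentIds;

    /** Node ids are assumed to be 0..nodeCount-1; each edge is {parent, child}. */
    CompactDag(int nodeCount, int[][] edges) {
        parentStart = new int[nodeCount + 1];
        for (int[] e : edges) {
            parentStart[e[1] + 1]++; // count the parents of each child
        }
        for (int i = 0; i < nodeCount; i++) {
            parentStart[i + 1] += parentStart[i]; // prefix sums give the slice offsets
        }
        parentIds = new int[edges.length];
        int[] next = parentStart.clone(); // next free slot per child
        for (int[] e : edges) {
            parentIds[next[e[1]]++] = e[0];
        }
    }

    /** Enumerates all paths from a root down to node by walking parent links upward. */
    List<int[]> pathsFromRoot(int node) {
        List<int[]> out = new ArrayList<>();
        walkUp(node, new ArrayList<>(), out);
        return out;
    }

    private void walkUp(int node, List<Integer> suffix, List<int[]> out) {
        suffix.add(node);
        if (parentStart[node] == parentStart[node + 1]) {
            // No parents: node is a root, so emit the collected suffix reversed.
            int[] path = new int[suffix.size()];
            for (int i = 0; i < path.length; i++) {
                path[i] = suffix.get(path.length - 1 - i);
            }
            out.add(path);
        } else {
            for (int i = parentStart[node]; i < parentStart[node + 1]; i++) {
                walkUp(parentIds[i], suffix, out);
            }
        }
        suffix.remove(suffix.size() - 1); // backtrack (removes by index)
    }
}
```

With this layout the permanent storage is one `int` per edge plus one per node, and no per-node objects are retained after construction. Whether the on-demand walk is fast enough for the 99% case would need to be measured against the precomputed variant.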

I would recommend using Neo4J. It's a graph database implemented in Java with a lot of low-level optimizations (e.g., node and edge attributes are stored in their own blocks so that node identities and their edges can be packed), and it mmaps the on-disk database. The cost of following an edge is independent of the number of nodes in the graph and of the number of edges on the origin node.


 