Gremlin: How to efficiently find "roots" in a directed acyclic graph?

I'm trying to write a Gremlin query that efficiently solves the "confluent rivers" problem (for lack of a better name; maybe there is one in graph theory?). Here's an example:

[Image: example graph]

The task: given one of the root nodes, deliver a map with the IDs of the downstream nodes as keys and, as values, all of their river root IDs (i.e. the end nodes reached by following every path upwards again from the current node).

For instance, in the example graph above, and for root node 0, the result should be:

{
 "0": ["0"],
 "1": ["0", "4"],
 "2": ["0", "5", "8"],
 "3": ["0", "4", "5", "8"],
 "6": ["0", "4"]
}

I'm particularly worried here about walking paths multiple times. For example, after calculating the roots of "2", I'd like to reuse that result for computing the roots of its downstream node "3".

Any clues how that might work for a large directed acyclic graph?

Given your diagram, we can create the following graph.

g.addV('0').as('0').
  addV('1').as('1').
  addV('2').as('2').
  addV('3').as('3').
  addV('4').as('4').
  addV('5').as('5').
  addV('6').as('6').
  addV('7').as('7').
  addV('8').as('8').
  addE('link').from('0').to('1').
  addE('link').from('0').to('2').
  addE('link').from('1').to('6').
  addE('link').from('1').to('3').
  addE('link').from('2').to('3').
  addE('link').from('4').to('1').
  addE('link').from('5').to('7').
  addE('link').from('7').to('2').
  addE('link').from('8').to('7').iterate()  

The query below starts at '0', walks out to every downstream node, and from each of those works backwards to find all of its roots. The output does not include the starting node ('0'), but the query can be tweaked to include it if necessary.

gremlin>  g.V().hasLabel('0').
......1>        repeat(out()).emit().
......2>        until(__.not(out())).dedup().
......3>        group().
......4>          by(label()).
......5>          by(repeat(__.in('link')).
......6>             until(__.not(__.in('link'))).
......7>             label().dedup().
......8>             fold())

==>[1:[0,4],2:[0,8,5],3:[0,8,4,5],6:[0,4]]       

If ordering is important, the query can be updated to order the lists.

UPDATED

Here is an extra example that also includes '0' as a key in the results.

gremlin>  g.V().hasLabel('0').
......1>        emit().repeat(out()).
......2>        until(__.not(out())).dedup().
......3>        group().
......4>          by(label()).
......5>          by(coalesce(
......6>              repeat(__.in('link')).
......7>              until(__.not(__.in('link'))).
......8>              label().dedup().
......9>              fold()))

==>[0:[],1:[0,4],2:[0,5,8],3:[0,4,5,8],6:[0,4]]   
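
On the question's concern about walking the same paths more than once: the queries above run a separate upstream walk for every downstream node. As a rough illustration of the reuse idea outside Gremlin, here is a minimal plain-Python sketch that caches each node's root set, so the roots of "2" are computed once and reused for "3". This is not a Gremlin feature and not what the queries above do internally. The edge list is copied from the `addE('link')` steps, and the names `EDGES`, `build_adjacency` and `roots_downstream_of` are hypothetical, chosen only for this example.

```python
# Edge list copied from the addE('link') steps above, as (from, to) pairs.
EDGES = [
    ("0", "1"), ("0", "2"), ("1", "6"), ("1", "3"), ("2", "3"),
    ("4", "1"), ("5", "7"), ("7", "2"), ("8", "7"),
]


def build_adjacency(edges):
    """Return (children, parents) adjacency maps for a list of directed edges."""
    children, parents = {}, {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
        parents.setdefault(dst, []).append(src)
    return children, parents


def roots_downstream_of(start, edges):
    """Map each node reachable from `start` (including `start`) to its sorted roots."""
    children, parents = build_adjacency(edges)
    memo = {}  # node -> frozenset of root IDs, filled in once per node

    def roots(node):
        # Assumption: the graph is acyclic; a cycle would recurse forever.
        # The recursion is fine for a sketch, but a very deep graph would
        # need an iterative version to stay under Python's recursion limit.
        if node not in memo:
            ups = parents.get(node, [])
            if not ups:
                # No incoming edges: the node is its own root.
                memo[node] = frozenset([node])
            else:
                # Union of the parents' cached root sets.
                memo[node] = frozenset().union(*(roots(p) for p in ups))
        return memo[node]

    # Iterative depth-first walk to collect everything downstream of `start`.
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))

    return {node: sorted(roots(node)) for node in sorted(seen)}


RESULT = roots_downstream_of("0", EDGES)
print(RESULT)
```

With the cache, each node's parents are inspected only once, however many downstream nodes share them. For root node "0" the sketch produces the map given in the question, including "0" mapped to itself.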
  
