
Gremlin query to get affected nodes from the Cosmos graph

We have a Cosmos graph database like the one below, i.e. A, B, C, ... are nodes/vertices and the edges are as shown by the arrows.

[Image: graph diagram]

Each node/vertex represents a value in a SQL table. The process and the requirement are as follows.

  1. User modifies the value of node A in the SQL table
  2. Gremlin query passes A into the graph
  3. Graph returns the following vertices in the order listed below
  4. C# app calculates the values of the D, K, M, P nodes in that order and updates the SQL table
  • D = A+B+C
  • K = F+E+D
  • M = J+K
  • P = L+M+N+O
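Step 4 can be sketched as follows. This is only an illustration in Python rather than the C# app itself: the `values` dictionary and its numbers are hypothetical stand-ins for the SQL table, and `recalculate` is a made-up helper name. It shows why the order matters: each formula reads values that an earlier step has already refreshed.

```python
# Hypothetical current values of the inputs; in the real system these come from the SQL table.
values = {"A": 1, "B": 2, "C": 3, "E": 4, "F": 5, "J": 6, "L": 7, "N": 8, "O": 9}

# Inputs of each derived node, taken from the formulas listed above.
inputs = {
    "D": ["A", "B", "C"],
    "K": ["F", "E", "D"],
    "M": ["J", "K"],
    "P": ["L", "M", "N", "O"],
}


def recalculate(order, values, inputs):
    """Recompute each derived node in the given order and return the updated values."""
    updated = dict(values)
    for node in order:
        # Every input must already be present, which is why the order matters.
        updated[node] = sum(updated[name] for name in inputs[node])
    return updated


result = recalculate(["D", "K", "M", "P"], values, inputs)
print(result["D"], result["K"], result["M"], result["P"])  # 6 15 21 45
```

Running the same helper with the order reversed would fail with a `KeyError`, because P would be evaluated before M exists.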

I tried the following query, but it costs over 3000 RUs, which is very expensive.

g.V("A").emit().repeat(__.in('depends')).until(__.inE().count().is(0))

We need some help to optimise the query. Thanks.

UPDATE ===========

OK, we can rebuild the graph in a single partition to reduce the RUs, but we have a scenario where multiple nodes are affected on the way up starting from A. They are highlighted in red in the picture below.

Can someone help with a query to get the results in A, D, K, O, M, P order, please? The logic of the query is that all child nodes should be listed before their parents.

g.addV('ddn').property('pk', 'pk').property(id, 'A').property('formula', 'A').
addV('ddn').property('pk', 'pk').property(id, 'B').property('formula', 'B').
addV('ddn').property('pk', 'pk').property(id, 'C').property('formula', 'C').
addV('ddn').property('pk', 'pk').property(id, 'D').property('formula', 'A+B+C').property('requires', "'A','B','C'").
addV('ddn').property('pk', 'pk').property(id, 'E').property('formula', 'E').
addV('ddn').property('pk', 'pk').property(id, 'F').property('formula', 'E').
addV('ddn').property('pk', 'pk').property(id, 'G').property('formula', 'H+I').property('requires', "'H','I'").
addV('ddn').property('pk', 'pk').property(id, 'H').property('formula', 'H').
addV('ddn').property('pk', 'pk').property(id, 'I').property('formula', 'I').
addV('ddn').property('pk', 'pk').property(id, 'J').property('formula', 'F+G').property('requires', "'F','G'").
addV('ddn').property('pk', 'pk').property(id, 'K').property('formula', 'D+E+F').property('requires', "'D','E','F'").
addV('ddn').property('pk', 'pk').property(id, 'L').property('formula', 'L').
addV('ddn').property('pk', 'pk').property(id, 'M').property('formula', 'J+K').
addV('ddn').property('pk', 'pk').property(id, 'N').property('formula', 'N').
addV('ddn').property('pk', 'pk').property(id, 'O').property('formula', 'A+K').property('requires', "'A','K'").
addV('ddn').property('pk', 'pk').property(id, 'P').property('formula', 'L+M+N+O').property('requires', "'L','M','N','O'").
V('D').addE('needs').to(V('A')).
V('D').addE('needs').to(V('B')).
V('D').addE('needs').to(V('C')).
V('G').addE('needs').to(V('H')).
V('G').addE('needs').to(V('I')).
V('K').addE('needs').to(V('D')).
V('K').addE('needs').to(V('E')).
V('K').addE('needs').to(V('F')).
V('J').addE('needs').to(V('F')).
V('J').addE('needs').to(V('G')).
V('O').addE('needs').to(V('A')).
V('O').addE('needs').to(V('K')).
V('M').addE('needs').to(V('J')).
V('M').addE('needs').to(V('K')).
V('P').addE('needs').to(V('L')).
V('P').addE('needs').to(V('M')).
V('P').addE('needs').to(V('N')).
V('P').addE('needs').to(V('O'))

[Image: graph diagram with the affected nodes highlighted in red]

I think the answer boils down to being able to sort the traversed vertices by their path length.

gremlin> g.V("A").
......1>   emit().repeat(__.in('needs')).path().
......2>   group().
......3>     by(tail(local)).
......4>     by(count(local).fold()).
......5>   order(local).
......6>     by(select(values).tail(local)).
......7>   select(keys)
==>[v[A],v[D],v[K],v[M],v[O],v[P]]

I group() by the last element in the path() and transform each path in the group to its length with count(local). That allows me to order() the results by the longest path for each vertex.
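The same idea can be checked outside the graph with a small in-memory model. The following is a sketch in Python, not a replacement for the Gremlin query: `NEEDS` copies the 'needs' edges from the question's sample data, while the helper names and the tie-break on vertex id for equal path lengths are my own assumptions.

```python
from collections import defaultdict

# 'needs' edges from the sample data, written as (dependent, dependency).
NEEDS = [
    ("D", "A"), ("D", "B"), ("D", "C"), ("G", "H"), ("G", "I"),
    ("K", "D"), ("K", "E"), ("K", "F"), ("J", "F"), ("J", "G"),
    ("O", "A"), ("O", "K"), ("M", "J"), ("M", "K"),
    ("P", "L"), ("P", "M"), ("P", "N"), ("P", "O"),
]


def dependents_of(edges):
    """Map each vertex to the vertices that need it, i.e. the in('needs') direction."""
    rev = defaultdict(list)
    for dependent, dependency in edges:
        rev[dependency].append(dependent)
    return rev


def all_paths(start, rev):
    """Yield every path from start, like emit().repeat(in('needs')).path().

    Assumes the data is acyclic; a cycle would make this loop forever.
    """
    stack = [[start]]
    while stack:
        path = stack.pop()
        yield path
        for parent in rev[path[-1]]:
            stack.append(path + [parent])


def affected_in_order(start, edges):
    """Group path lengths by the last vertex, then order by the longest one."""
    rev = dependents_of(edges)
    lengths = defaultdict(list)
    for path in all_paths(start, rev):
        lengths[path[-1]].append(len(path))
    # Ties (M and O both have a longest path of 4) are broken by vertex id.
    return sorted(lengths, key=lambda v: (max(lengths[v]), v))


print(affected_in_order("A", NEEDS))  # ['A', 'D', 'K', 'M', 'O', 'P']
```

M and O share the same longest path length, so either order between them satisfies the "children before parents" rule.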

Note that I don't think you need until(__.inE().count().is(0)) because you're just traversing to path exhaustion in either case. Also, take care with __.inE().count().is(0) as you end up counting all the edges just to detect a count of zero. Most graphs should optimize that to just until(__.not(__.inE())), but it's always best to be explicit in my opinion. That said, you need to be sure of your data structures when using repeat(): it takes just one edge of bad data to send your traversal into an infinite loop. Consider some kind of upper bound to your repeat() that makes sense for your data so that the loop will terminate at some point.
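To illustrate why an upper bound matters, here is a small Python model of a bounded walk. It is a sketch under assumptions: `bounded_paths` and `max_hops` are hypothetical names, and `CYCLIC` is deliberately bad data in which A needs K, closing a loop. Without the bound the walk below would never finish.

```python
from collections import defaultdict


def bounded_paths(start, edges, max_hops):
    """Walk dependents from start, never extending a path beyond max_hops edges."""
    rev = defaultdict(list)
    for dependent, dependency in edges:
        rev[dependency].append(dependent)
    stack = [[start]]
    while stack:
        path = stack.pop()
        yield path
        # A path with n vertices has n - 1 hops; stop extending at the bound.
        if len(path) - 1 < max_hops:
            for parent in rev[path[-1]]:
                stack.append(path + [parent])


# Bad data: D needs A, K needs D, and A needs K, which forms a cycle.
CYCLIC = [("D", "A"), ("K", "D"), ("A", "K")]

paths = list(bounded_paths("A", CYCLIC, max_hops=4))
print(len(paths), paths[-1])  # 5 ['A', 'D', 'K', 'A', 'D']
```

The bound does not repair the data, it only guarantees termination. A vertex that shows up twice in one path, as A and D do here, is a sign that the graph contains a cycle.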

Here is an alternative which might actually be better, since it doesn't bother to hold all the counts in the Map after the group():

gremlin> g.V("A").
......1>   emit().repeat(__.in('needs')).path().
......2>   group().
......3>     by(tail(local)).
......4>     by(count(local).order(local).tail(local)).
......5>   order(local).
......6>     by(values).
......7>   select(keys)
==>[v[A],v[D],v[K],v[M],v[O],v[P]]
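The "keep only the largest count" idea can also be modelled without storing any paths at all. The Python sketch below is a swapped-in technique, not what the Gremlin query does internally: it keeps a single running depth per vertex and raises it whenever a longer route is found. `NEEDS` copies the sample edges, the function names are hypothetical, and the data is assumed to be acyclic.

```python
from collections import defaultdict

# 'needs' edges from the sample data, written as (dependent, dependency).
NEEDS = [
    ("D", "A"), ("D", "B"), ("D", "C"), ("G", "H"), ("G", "I"),
    ("K", "D"), ("K", "E"), ("K", "F"), ("J", "F"), ("J", "G"),
    ("O", "A"), ("O", "K"), ("M", "J"), ("M", "K"),
    ("P", "L"), ("P", "M"), ("P", "N"), ("P", "O"),
]


def longest_depths(start, edges):
    """Return the longest path length (in vertices) from start to each affected vertex.

    Assumes an acyclic graph; on cyclic data the depths would grow without end.
    """
    rev = defaultdict(list)
    for dependent, dependency in edges:
        rev[dependency].append(dependent)
    depth = {start: 1}
    frontier = [start]
    while frontier:
        next_frontier = []
        for vertex in frontier:
            for parent in rev[vertex]:
                # Revisit a parent only when a longer route to it has been found.
                if depth.get(parent, 0) < depth[vertex] + 1:
                    depth[parent] = depth[vertex] + 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return depth


def ordered(start, edges):
    """Order affected vertices by longest depth, breaking ties by vertex id."""
    depth = longest_depths(start, edges)
    return sorted(depth, key=lambda v: (depth[v], v))


print(longest_depths("A", NEEDS))
print(ordered("A", NEEDS))  # ['A', 'D', 'K', 'M', 'O', 'P']
```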

