Recursive query with sub-graph aggregation (arbitrary depth)

I asked a question earlier about aggregating quantities along a graph. The two answers provided worked well, but now I am trying to extend the Cypher query to a graph of variable depth.

To summarize, we start off with a bunch of leaf stores, each of which is associated with a particular supplier, recorded as the Supplier property on the Store node. Inventory is then moved along to other stores, and the proportion attributed to each supplier corresponds to its contribution to the store the inventory came from.

So for node B02, S2 contributed 750/1250 = 60% and S3 contributed 40%. We then move 600 units out of B02, of which 60% belongs to S2 and 40% to S3, and so on.

[Figure: diagram of the store graph]

What we want to know is what percentage of the final 700 units moved into D01 belongs to each supplier, where suppliers with the same name are the same supplier. So for the above graph we expect:

S1, 38.09
S2, 27.61
S3, 34.28
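As a cross-check of those expected figures, here is a minimal Python sketch of the rule described above: a store's supplier mix is the quantity-weighted average of the mixes of the stores that feed it. The store names, suppliers, and quantities are taken from the question's graph; the names `SUPPLIER`, `MOVES`, `INCOMING`, and `supplier_mix` are made up for this illustration and are not part of any Cypher solution.

```python
from collections import defaultdict
from functools import lru_cache

# Leaf stores and their supplier, copied from the question's graph.
SUPPLIER = {"A01": "S1", "A02": "S1", "A03": "S2", "A04": "S3",
            "A05": "S1", "A06": "S1", "A07": "S2", "A08": "S3"}

# (source, target, quantity) for every MOVE_TO relationship.
MOVES = [
    ("A01", "B01", 750), ("A02", "B01", 500),
    ("A03", "B02", 750), ("A04", "B02", 500),
    ("A05", "B03", 100), ("A06", "B03", 200),
    ("A07", "B04", 50), ("A08", "B04", 450),
    ("B01", "C01", 400), ("B02", "C01", 600),
    ("B03", "C02", 100), ("B04", "C02", 200),
    ("C01", "D01", 500), ("C02", "D01", 200),
]

# Index the moves by target store so each store can look up what flows into it.
INCOMING = defaultdict(list)
for src, dst, qty in MOVES:
    INCOMING[dst].append((src, qty))


@lru_cache(maxsize=None)
def supplier_mix(store):
    """Return a sorted tuple of (supplier, fraction) pairs for one store."""
    if store in SUPPLIER:
        # A leaf store is supplied entirely by its own supplier.
        return ((SUPPLIER[store], 1.0),)
    total = sum(qty for _, qty in INCOMING[store])
    mix = defaultdict(float)
    for src, qty in INCOMING[store]:
        # Each inbound move carries the mix of the store it came from.
        for supplier, fraction in supplier_mix(src):
            mix[supplier] += fraction * qty / total
    return tuple(sorted(mix.items()))


for supplier, fraction in supplier_mix("D01"):
    print(f"{supplier}, {100 * fraction:.2f}")
```

Rounded to two decimals this prints 38.10, 27.62, and 34.29; the figures listed above are the same values truncated rather than rounded.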

I've prepared a graph using this Cypher script:

CREATE (A01:Store {Name: 'A01', Supplier: 'S1'})
CREATE (A02:Store {Name: 'A02', Supplier: 'S1'})
CREATE (A03:Store {Name: 'A03', Supplier: 'S2'})
CREATE (A04:Store {Name: 'A04', Supplier: 'S3'})
CREATE (A05:Store {Name: 'A05', Supplier: 'S1'})
CREATE (A06:Store {Name: 'A06', Supplier: 'S1'})
CREATE (A07:Store {Name: 'A07', Supplier: 'S2'})
CREATE (A08:Store {Name: 'A08', Supplier: 'S3'})

CREATE (B01:Store {Name: 'B01'})
CREATE (B02:Store {Name: 'B02'})
CREATE (B03:Store {Name: 'B03'})
CREATE (B04:Store {Name: 'B04'})

CREATE (C01:Store {Name: 'C01'})
CREATE (C02:Store {Name: 'C02'})

CREATE (D01:Store {Name: 'D01'})

CREATE (A01)-[:MOVE_TO {Quantity: 750}]->(B01)
CREATE (A02)-[:MOVE_TO {Quantity: 500}]->(B01)
CREATE (A03)-[:MOVE_TO {Quantity: 750}]->(B02)
CREATE (A04)-[:MOVE_TO {Quantity: 500}]->(B02)
CREATE (A05)-[:MOVE_TO {Quantity: 100}]->(B03)
CREATE (A06)-[:MOVE_TO {Quantity: 200}]->(B03)
CREATE (A07)-[:MOVE_TO {Quantity: 50}]->(B04)
CREATE (A08)-[:MOVE_TO {Quantity: 450}]->(B04)

CREATE (B01)-[:MOVE_TO {Quantity: 400}]->(C01)
CREATE (B02)-[:MOVE_TO {Quantity: 600}]->(C01)
CREATE (B03)-[:MOVE_TO {Quantity: 100}]->(C02)
CREATE (B04)-[:MOVE_TO {Quantity: 200}]->(C02)

CREATE (C01)-[:MOVE_TO {Quantity: 500}]->(D01)
CREATE (C02)-[:MOVE_TO {Quantity: 200}]->(D01)

The current query is this:

MATCH (s:Store { Name:'D01' })
MATCH (s)<-[t:MOVE_TO]-()<-[r:MOVE_TO]-(supp)
WITH t.Quantity as total, collect(r) as movements
WITH total, movements, reduce(totalSupplier = 0, r IN movements | totalSupplier + r.Quantity) as supCount
UNWIND movements as movement
RETURN startNode(movement).Supplier as Supplier, round(100.0*movement.Quantity/supCount) as pct

I am trying to use recursive (variable-length) relationships, something along the lines of this:

MATCH (s)<-[t:MOVE_TO]-()<-[r:MOVE_TO*]-(supp)

However, that gives multiple paths to the end node, and I think I need to aggregate the inventory at each node.

As I said before, I enjoyed this question. I know you already accepted an answer, but I decided to post my final response because it also returns the percentage (named Percentile in the query) without any client-side effort, which means you can also do a SET on the nodes to update the value in the db when you need to. And, of course, posting it gives me something I can come back to :) Here is the link to the console example.

It returns a row with the store name, the sum moved to it from all suppliers, and the percentage for each supplier:

MATCH p = (s)<-[:MOVE_TO*]-(sup)
WHERE HAS (sup.Supplier) AND NOT HAS (s.Supplier)
WITH s,sup,reduce(totalSupplier = 0, r IN relationships(p)| totalSupplier + r.Quantity) AS TotalAmountMoved
WITH sum(TotalAmountMoved) AS sumMoved, collect(DISTINCT ([sup.Supplier, TotalAmountMoved])) AS MyDataPart1,s
WITH reduce(b=[], c IN MyDataPart1| b +[{ Supplier: c[0], Quantity: c[1], Percentile: ((c[1]*1.00))/(sumMoved*1.00)*100.00 }]) AS MyData, s, sumMoved
RETURN s.Name, sumMoved, MyData

I can't think my way through a solution in pure Cypher, because I don't think you can do recursion like this in Cypher. However, you can use Cypher to return all of the data in the tree in a simple way, so that you can compute the result in your favorite programming language. Something like this:

MATCH path=(source:Store)-[move:MOVE_TO*]->(target:Store {Name: 'D01'})
WHERE source.Supplier IS NOT NULL
RETURN
  source.Supplier,
  reduce(a=[], move IN relationships(path)| a + [{id: ID(move), Quantity: move.Quantity}])

This returns the id and the quantity of each relationship along each path. You could then process that client-side (perhaps first converting it into a nested data structure?).
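One possible shape for that client-side step, sketched in Python. It relies on an assumption that holds for the question's graph but not for every graph: each store has at most one outgoing move, so two relationships that are followed by the same next relationship must end at the same store. The rows below are hypothetical (the relationship ids are invented, and `rel`, `ROWS`, and `supplier_shares` are illustrative names); real rows would come from the query above.

```python
from collections import defaultdict


def rel(rel_id, quantity):
    """Build one relationship entry in the shape the query returns."""
    return {"id": rel_id, "Quantity": quantity}


# Hypothetical rows: (source.Supplier, relationships ordered from leaf to D01).
# The ids are invented for illustration only.
ROWS = [
    ("S1", [rel(0, 750), rel(8, 400), rel(12, 500)]),   # A01 -> B01 -> C01 -> D01
    ("S1", [rel(1, 500), rel(8, 400), rel(12, 500)]),   # A02 -> B01 -> C01 -> D01
    ("S2", [rel(2, 750), rel(9, 600), rel(12, 500)]),   # A03 -> B02 -> C01 -> D01
    ("S3", [rel(3, 500), rel(9, 600), rel(12, 500)]),   # A04 -> B02 -> C01 -> D01
    ("S1", [rel(4, 100), rel(10, 100), rel(13, 200)]),  # A05 -> B03 -> C02 -> D01
    ("S1", [rel(5, 200), rel(10, 100), rel(13, 200)]),  # A06 -> B03 -> C02 -> D01
    ("S2", [rel(6, 50), rel(11, 200), rel(13, 200)]),   # A07 -> B04 -> C02 -> D01
    ("S3", [rel(7, 450), rel(11, 200), rel(13, 200)]),  # A08 -> B04 -> C02 -> D01
]


def supplier_shares(rows):
    """Return {supplier: fraction of the target store's inventory}."""
    quantity = {}   # relationship id -> Quantity (each relationship kept once)
    next_rel = {}   # relationship id -> id of the following relationship, or None
    for _, path in rows:
        for i, r in enumerate(path):
            quantity[r["id"]] = r["Quantity"]
            next_rel[r["id"]] = path[i + 1]["id"] if i + 1 < len(path) else None

    # Total inflow of each store, keyed by the relationship that leaves it
    # (None stands for the target store at the end of every path).
    inflow = defaultdict(float)
    for rel_id, following in next_rel.items():
        inflow[following] += quantity[rel_id]

    # A path's contribution is the product of each move's share of the
    # inflow at the store it arrives in.
    shares = defaultdict(float)
    for supplier, path in rows:
        weight = 1.0
        for r in path:
            weight *= r["Quantity"] / inflow[next_rel[r["id"]]]
        shares[supplier] += weight
    return dict(shares)


for supplier, share in sorted(supplier_shares(ROWS).items()):
    print(f"{supplier}, {100 * share:.2f}")
```

If a store could move inventory to more than one other store, the query would need to return the end nodes of each relationship as well, since the grouping trick used here would no longer identify the receiving store.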

This solution generates correct results for any graph that conforms to the model described in the question. (When Store x moves merchandise to Store y, the Supplier percentages of the moved merchandise are assumed to be the same as those of Store x.)

However, this solution does not consist of just a single Cypher query (since that may not be possible). Instead, it involves multiple queries, one of which must be iterated until the calculations have cascaded through the entire graph of Store nodes. That iterated query clearly tells you when to stop iterating: it returns 0 rows. The other Cypher queries are needed to prepare the graph for iteration, report the Supplier percentages for the "end" node(s), and clean up the graph (so that it is restored to the way it was before step 1, below).
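The "repeat until nothing comes back" part has to live in client code. A minimal Python sketch of such a loop follows; `run_until_empty` and its `run_query` callback are hypothetical names, and the callback stands in for whatever call your driver uses to execute the iterated query and return its rows. The `max_rounds` cap is my own safety net, not something the solution prescribes.

```python
def run_until_empty(run_query, max_rounds=1000):
    """Call run_query() until it returns no rows.

    Returns the number of calls that produced at least one row.
    """
    for productive_rounds in range(max_rounds):
        rows = list(run_query())
        if not rows:
            return productive_rounds
    raise RuntimeError(f"still returning rows after {max_rounds} rounds")


# Stand-in for the database: three productive rounds, then an empty result.
fake_results = iter([["B01", "B02", "B03", "B04"], ["C01", "C02"], ["D01"], []])
print(run_until_empty(lambda: next(fake_results)))  # prints 3
```

With a real database, `run_query` would wrap the execution of the iterated query; everything else stays the same.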

These queries could probably be further optimized.

Here are the required steps:

  1. Prepare the graph for the iterative query (this initializes the temporary pcts array for all starting Store nodes). This includes the creation of a singleton Suppliers node that has an array with all the supplier names. That array establishes the order of the elements in the temporary pcts arrays and maps those elements back to the correct supplier name.

     MATCH (store:Store)
     WHERE HAS (store.Supplier)
     WITH COLLECT(store) AS stores, COLLECT(DISTINCT store.Supplier) AS csup
     CREATE (sups:Suppliers { names: csup })
     WITH stores, sups
     UNWIND stores AS store
     SET store.pcts = EXTRACT(i IN RANGE(0,LENGTH(sups.names)-1,1) | CASE WHEN store.Supplier = sups.names[i] THEN 1.0 ELSE 0.0 END)
     RETURN store.Name, store.Supplier, store.pcts;

    Here is the result with the question's data:

     +---------------------------------------------+
     | store.Name | store.Supplier | store.pcts    |
     +---------------------------------------------+
     | "A01"      | "S1"           | [1.0,0.0,0.0] |
     | "A02"      | "S1"           | [1.0,0.0,0.0] |
     | "A03"      | "S2"           | [0.0,1.0,0.0] |
     | "A04"      | "S3"           | [0.0,0.0,1.0] |
     | "A05"      | "S1"           | [1.0,0.0,0.0] |
     | "A06"      | "S1"           | [1.0,0.0,0.0] |
     | "A07"      | "S2"           | [0.0,1.0,0.0] |
     | "A08"      | "S3"           | [0.0,0.0,1.0] |
     +---------------------------------------------+
     8 rows
     83 ms
     Nodes created: 1
     Properties set: 9

  2. Iterative query (run repeatedly until 0 rows are returned).

     MATCH p=(s1:Store)-[m:MOVE_TO]->(s2:Store)
     WHERE HAS(s1.pcts) AND NOT HAS(s2.pcts)
     SET s2.pcts = EXTRACT(i IN RANGE(1,LENGTH(s1.pcts),1) | 0)
     WITH s2, COLLECT(p) AS ps
     WITH s2, ps, REDUCE(s=0, p IN ps | s + HEAD(RELATIONSHIPS(p)).Quantity) AS total
     FOREACH(p IN ps |
       SET HEAD(RELATIONSHIPS(p)).pcts = EXTRACT(parentPct IN HEAD(NODES(p)).pcts | parentPct * HEAD(RELATIONSHIPS(p)).Quantity / total)
     )
     FOREACH(p IN ps |
       SET s2.pcts = EXTRACT(i IN RANGE(0,LENGTH(s2.pcts)-1,1) | s2.pcts[i] + HEAD(RELATIONSHIPS(p)).pcts[i])
     )
     RETURN s2.Name, s2.pcts, total, EXTRACT(p IN ps | HEAD(RELATIONSHIPS(p)).pcts) AS rel_pcts;

    Iteration 1 result:

     +-----------------------------------------------------------------------------------------------+
     | s2.Name | s2.pcts       | total | rel_pcts                                                    |
     +-----------------------------------------------------------------------------------------------+
     | "B04"   | [0.0,0.1,0.9] | 500   | [[0.0,0.1,0.0],[0.0,0.0,0.9]]                               |
     | "B01"   | [1.0,0.0,0.0] | 1250  | [[0.6,0.0,0.0],[0.4,0.0,0.0]]                               |
     | "B03"   | [1.0,0.0,0.0] | 300   | [[0.3333333333333333,0.0,0.0],[0.6666666666666666,0.0,0.0]] |
     | "B02"   | [0.0,0.6,0.4] | 1250  | [[0.0,0.6,0.0],[0.0,0.0,0.4]]                               |
     +-----------------------------------------------------------------------------------------------+
     4 rows
     288 ms
     Properties set: 24

    Iteration 2 result:

     +-------------------------------------------------------------------------------------------------------------------------------+
     | s2.Name | s2.pcts                                      | total | rel_pcts                                                     |
     +-------------------------------------------------------------------------------------------------------------------------------+
     | "C02"   | [0.3333333333333333,0.06666666666666667,0.6] | 300   | [[0.3333333333333333,0.0,0.0],[0.0,0.06666666666666667,0.6]] |
     | "C01"   | [0.4,0.36,0.24]                              | 1000  | [[0.4,0.0,0.0],[0.0,0.36,0.24]]                              |
     +-------------------------------------------------------------------------------------------------------------------------------+
     2 rows
     193 ms
     Properties set: 12

    Iteration 3 result:

     +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | s2.Name | s2.pcts | total | rel_pcts | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | "D01" | [0.38095238095238093,0.27619047619047615,0.34285714285714286] | 700 | [[0.2857142857142857,0.2571428571428571,0.17142857142857143],[0.09523809523809522,0.01904761904761905,0.17142857142857143]] | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row 40 ms Properties set: 6 

    Iteration 4 result:

     +--------------------------------------+
     | s2.Name | s2.pcts | total | rel_pcts |
     +--------------------------------------+
     +--------------------------------------+
     0 rows
     69 ms

  3. List the non-zero Supplier percentages for the ending Store node(s).

     MATCH (store:Store), (sups:Suppliers)
     WHERE NOT (store:Store)-[:MOVE_TO]->(:Store) AND HAS(store.pcts)
     RETURN store.Name, [i IN RANGE(0,LENGTH(sups.names)-1,1) WHERE store.pcts[i] > 0 | {supplier: sups.names[i], pct: store.pcts[i] * 100}] AS pcts;

    Result:

     +----------------------------------------------------------------------------------------------------------------------------------+
     | store.Name | pcts                                                                                                                |
     +----------------------------------------------------------------------------------------------------------------------------------+
     | "D01"      | [{supplier=S1, pct=38.095238095238095},{supplier=S2, pct=27.619047619047617},{supplier=S3, pct=34.285714285714285}] |
     +----------------------------------------------------------------------------------------------------------------------------------+
     1 row
     293 ms

  4. Clean up (remove all the temporary pcts properties and the Suppliers node).

     MATCH (s:Store), (sups:Suppliers)
     OPTIONAL MATCH (s)-[m:MOVE_TO]-()
     REMOVE m.pcts, s.pcts
     DELETE sups;

    Result:

     0 rows
     203 ms
     +-------------------+
     | No data returned. |
     +-------------------+
     Properties set: 29
     Nodes deleted: 1
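To make the data flow of steps 1 to 3 easier to follow, here is a rough in-memory Python analogue. It is not a translation of the Cypher queries: all names (`SUPPLIER`, `MOVES`, `supplier_percentages`) are illustrative, the supplier order is simply alphabetical, and a store is only processed once every store that feeds it already has a vector. That last choice is my own assumption; it matters only when a store's sources sit at different depths, which does not happen in the question's data.

```python
# Leaf stores and their supplier, copied from the question's graph.
SUPPLIER = {"A01": "S1", "A02": "S1", "A03": "S2", "A04": "S3",
            "A05": "S1", "A06": "S1", "A07": "S2", "A08": "S3"}

# (source, target, quantity) for every MOVE_TO relationship.
MOVES = [
    ("A01", "B01", 750), ("A02", "B01", 500),
    ("A03", "B02", 750), ("A04", "B02", 500),
    ("A05", "B03", 100), ("A06", "B03", 200),
    ("A07", "B04", 50), ("A08", "B04", 450),
    ("B01", "C01", 400), ("B02", "C01", 600),
    ("B03", "C02", 100), ("B04", "C02", 200),
    ("C01", "D01", 500), ("C02", "D01", 200),
]


def supplier_percentages(supplier_of, moves):
    """Return {end store: {supplier: percentage}} for stores with no outgoing move."""
    # Step 1: fix the supplier order (the role of the Suppliers node) and give
    # every starting store a one-hot vector.
    names = sorted(set(supplier_of.values()))
    pcts = {store: [1.0 if supplier == name else 0.0 for name in names]
            for store, supplier in supplier_of.items()}

    incoming = {}
    for src, dst, qty in moves:
        incoming.setdefault(dst, []).append((src, qty))

    # Step 2: repeat until a pass finds nothing left to compute
    # (the analogue of the iterated query returning 0 rows).
    while True:
        ready = [dst for dst, edges in incoming.items()
                 if dst not in pcts and all(src in pcts for src, _ in edges)]
        if not ready:
            break
        for dst in ready:
            total = sum(qty for _, qty in incoming[dst])
            # Quantity-weighted average of the source vectors.
            pcts[dst] = [sum(pcts[src][i] * qty / total for src, qty in incoming[dst])
                         for i in range(len(names))]

    # Step 3: report non-zero percentages for stores that move nothing onward.
    sources = {src for src, _, _ in moves}
    return {store: {name: 100 * value for name, value in zip(names, vector) if value > 0}
            for store, vector in pcts.items() if store not in sources}


for store, by_supplier in supplier_percentages(SUPPLIER, MOVES).items():
    for supplier, pct in by_supplier.items():
        print(f"{store} {supplier} {pct:.2f}")
```

Because nothing is written back to a database, there is no counterpart to step 4 here.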
