
Neo4j Cypher query for finding nodes based on characteristic deltas

In my Neo4j/Spring Data Neo4j 4 project I have an entity: Product.

Every Product has an Integer property: price.

For example, I have the following products with prices:

Product1.price = 100
Product2.price = 305
Product3.price = 10000
Product4.price = 1000
Product5.price = 220

Products are not connected to each other by any relationships.

Based on an initial price value (a Cypher query parameter), I need to find a set (path) of products in which each product's price differs from the previous one by no more than a maximum price delta (also a Cypher query parameter).

For example, I need to find all products in the Neo4j database starting from price = 50 with price delta = 150. As output I expect to get the following products:

Product1.price = 100
Product5.price = 220
Product2.price = 305

The computation looks like this:

The starting price is 50, so the first product should have a price of not less than 50 and not more than 200 (50 + 150). Based on this, we find a product in our catalog with price = 100. The second product should have a price of not less than 100 and not more than 250 (100 + 150): this is the product with price = 220. The third product should have a price of not less than 220 and not more than 370 (220 + 150): this is the product with price = 305. The next window, 305 to 455, contains no product, so the set ends there.
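To make the arithmetic above concrete, here is a minimal Python sketch of the same computation outside the database. The function name `price_chain` is hypothetical, and it assumes that "the first product" means the lowest-priced product inside the current window.

```python
def price_chain(prices, start_price, delta):
    """Return (low, high, picked) steps for the worked example above.

    Assumption: at each step the lowest-priced remaining product inside
    the window [low, low + delta] is picked.
    """
    steps = []
    low = start_price
    for price in sorted(p for p in prices if p >= start_price):
        high = low + delta
        if price > high:
            break  # prices are ascending, so no later price can fit either
        steps.append((low, high, price))
        low = price  # the next window starts at the price just picked
    return steps


catalog = [100, 305, 10000, 1000, 220]
for low, high, picked in price_chain(catalog, 50, 150):
    print(f"window [{low}, {high}] -> {picked}")
```

For the sample catalog this walks the windows 50 to 200, 100 to 250 and 220 to 370, then stops at the product priced 1000.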

Could you please show a Cypher query that will find such products?

This is rather complex to perform in Cypher. The only approach that comes to mind is to use the REDUCE() function along with a CASE expression to conditionally add a product to the end of the list if its price is within the delta of the price of the last product in the list.

Keep in mind that there is no way to short-circuit the processing of products with this approach. If there are 1 million products in total, and only the first two products in the ordered list fit the delta pattern, this query will still check every one of the remaining products, although none of them will be added to the list.

This query should work for you:

WITH {startPrice:50, delta:150} as params
MATCH (p:Product)
WHERE p.price >= params.startPrice
WITH params, p
ORDER BY p.price asc
WITH params, COLLECT(p) as products
WITH params, TAIL(products) as products, HEAD(products) as first
WHERE first.price <= params.startPrice + params.delta
WITH REDUCE(prods = [first], prod in products | 
  CASE WHEN prod.price <= LAST(prods).price + params.delta 
       THEN prods + prod 
       ELSE prods END) as products
RETURN products
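As a rough illustration of what the REDUCE() expression does, and of why it cannot stop early, here is a Python sketch that mirrors the query with `functools.reduce`. The function name `chain_with_reduce` and the `visited` counter are hypothetical additions for illustration; plain integers stand in for Product nodes.

```python
from functools import reduce


def chain_with_reduce(prices, start_price, delta):
    """Mirror the REDUCE/CASE query; return (chain, products visited by the fold)."""
    # MATCH ... WHERE p.price >= startPrice, ORDER BY p.price ASC, COLLECT
    ordered = sorted(p for p in prices if p >= start_price)
    # WHERE first.price <= startPrice + delta (no rows if it fails)
    if not ordered or ordered[0] > start_price + delta:
        return [], 0
    first, rest = ordered[0], ordered[1:]
    visited = 0

    def step(prods, prod):
        nonlocal visited
        visited += 1  # the fold touches every element, whether or not it is added
        return prods + [prod] if prod <= prods[-1] + delta else prods

    return reduce(step, rest, [first]), visited


chain, visited = chain_with_reduce([100, 305, 10000, 1000, 220], 50, 150)
print(chain, visited)
```

With a catalog such as `[100, 220] + list(range(1000, 2000))`, the chain is only `[100, 220]`, yet the fold still visits all 1001 remaining prices, which is the cost described above.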

The solution requires carrying an intermediate result from one iteration to the next. It is an interesting problem, because Cypher currently does not offer this possibility directly. As an exercise (a sketch), you can use the apoc.periodic.commit procedure from the APOC library:

CALL apoc.create.uuid() YIELD uuid
CALL apoc.periodic.commit("
  MERGE (H:tmpVars {id: {tmpId}})
  ON CREATE SET H.prices = [],
                H.lastPrice = {lastPrice}, 
                H.delta = {delta} 
  WITH H
  MATCH (P:Product) WHERE P.price > H.lastPrice AND 
                          P.price < H.lastPrice + H.delta
  WITH H, max(P.price) as lastPrice
  SET H.lastPrice = lastPrice, 
      H.prices = H.prices + lastPrice
  RETURN 1
  ", {tmpId: uuid, delta: 150, lastPrice: 50}
) YIELD updates, executions, runtime
MATCH (T:tmpVars {id: uuid}) 
WITH T, T.prices as prices DETACH DELETE T
WITH prices 
UNWIND prices as price
MATCH (P:Product) WHERE P.price = price
RETURN P ORDER BY P.price ASC

As an alternative solution, which should be much faster to query but requires more maintenance and care to keep working properly (especially with rapidly changing product price data), you can create relationships between your Product nodes in ascending price order, and keep the deltas as relationship properties.

Here's how you might create this using APOC Procedures:

MATCH (p:Product)
WITH p 
ORDER BY p.price ASC
WITH apoc.coll.pairsMin(COLLECT(p)) as products
UNWIND products as prodPairs
WITH prodPairs[0] as prod1, prodPairs[1] as prod2
CREATE (prod1)-[r:NextProd]->(prod2)
SET r.delta = prod2.price - prod1.price
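For intuition about the structure this builds, here is a small Python sketch of the same pairing step. The function name `next_prod_links` is hypothetical; each tuple stands in for one NextProd relationship, with plain prices in place of Product nodes.

```python
def next_prod_links(prices):
    """Pair each price with the next higher one, like apoc.coll.pairsMin
    applied to the price-ordered list, and attach the delta to each pair."""
    ordered = sorted(prices)
    # zip stops at the shorter list, so the last product gets no outgoing link
    return [(a, b, b - a) for a, b in zip(ordered, ordered[1:])]


for prod1, prod2, delta in next_prod_links([100, 305, 10000, 1000, 220]):
    print(f"({prod1})-[:NextProd {{delta: {delta}}}]->({prod2})")
```

For the sample catalog this yields four links with deltas 120, 85, 695 and 9000.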

And here's how you might query this once it's set up:

WITH {startPrice:50, delta:150} as params
WITH params, params.startPrice + params.delta as ceiling
MATCH (start:Product)
WHERE params.startPrice <= start.price <= ceiling
WITH start, params
ORDER BY start.price ASC
LIMIT 1
MATCH (start)-[r:NextProd*0..]->(product:Product)
WHERE ALL(rel in r WHERE rel.delta <= params.delta)
RETURN DISTINCT product

This should be a fairly fast query, as the ALL() predicate should cut off the variable-length match when it reaches a relationship that exceeds the desired delta.
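The cut-off behaviour can be pictured with a short Python sketch that follows such links until one exceeds the delta. This is only an in-memory analogy, not how Neo4j evaluates the pattern; the `links` dictionary and `follow_links` function are hypothetical, and the sketch assumes distinct prices so that a price can serve as a key.

```python
# Hypothetical stand-in for the NextProd relationships of the sample catalog:
# price -> (next price, delta stored on the relationship)
links = {100: (220, 120), 220: (305, 85), 305: (1000, 695), 1000: (10000, 9000)}


def follow_links(links, start, max_delta):
    """Walk the linked list from start; stop at the first link whose delta
    exceeds max_delta. Returns (reached prices, number of links inspected)."""
    reached = [start]
    current = start
    inspected = 0
    while current in links:
        nxt, delta = links[current]
        inspected += 1
        if delta > max_delta:
            break  # nothing past this link is examined
        reached.append(nxt)
        current = nxt
    return reached, inspected


print(follow_links(links, 100, 150))
```

Starting from the product priced 100, only three links are inspected before the walk stops, regardless of how many products lie beyond the first oversized gap.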

The downside, of course, is that you'll need to make sure every operation that affects this linked list structure (adding or removing products and changing product prices) properly adjusts the structure. You might also need to consider locking approaches to ensure thread safety, so you don't mangle the linked list if products and/or prices are updated concurrently.
