Neo4j - Complex Cypher query - needs outer join
I am using the Neo4j graph database and am trying to build a complex Cypher query.
I have the following nodes and relations:
Nodes:
Relations:
I want to get the count of customers who are eligible for a new reservation for a certain brand.
So I need to count the reservations of each customer for the given brand and compare that count with MaxReservationsPerBrandPerMonth for the customer's segment.
Note that customers who don't have any reservations should be counted as well.
Thanks for your support.
EDIT 1
After trying Michael's first query, the query is:
PROFILE MATCH (c:Customer)
// also count customers without reservations (make this match optional)
OPTIONAL MATCH (c)-[:CUST_RESERVED]->(r:Reservation)-[:RESERVATION_FOR]->(b:brand {BRAND_ID: "3"})
// count reservations by brand and customers
WITH c, b, count(*) as reservations
MATCH (c)-[:SEGMENTED]->(s:segment)
WHERE reservations < s.max_sms_per_month
// aggregate count customers per brand
RETURN b.NAME, count(distinct c) as customers
Query result:
The result is unexpected. I want the eligible customers for the "Honda" brand, but the result is only 5 (customers who already have reservations for Honda but are still eligible because they haven't reached the maximum yet). The result should be 998734 (all customers).
Query profile:
Data set: here. Needed query in SQL: here.
EDIT 2
Michael's second query worked like a charm in around 20 seconds, thank you Michael. I now need to include some date logic to check only reservations within a certain period, please :)
You already spelled it out. Please see our online guide for how you'd go from your use-case question to a model: http://neo4j.com/developer/guide-data-modeling/
So your patterns are:
(:Customer)-[:IS_SEGMENTED]->(:Segment {MaxReservationsPerBrandPerMonth:int})
(:Customer)-[:HAS_RESERVED]->(:Reservation)
(:Reservation)-[:FOR_BRAND]->(:Brand)
Your question:
- I want to get the count of customers who are eligible for a new reservation for a certain brand.
- So I need to count the reservations of each customer for the given brand and compare that count with MaxReservationsPerBrandPerMonth for the customer's segment.
- Note that customers who don't have any reservations should be counted as well.
MATCH (c:Customer)
// also count customers without reservations (make this match optional)
OPTIONAL MATCH (c)-[:HAS_RESERVED]->(r:Reservation)-[:FOR_BRAND]->(b:Brand)
// count reservations by brand and customers
WITH c, b, count(*) as reservations
MATCH (c)-[:IS_SEGMENTED]->(s:Segment)
WHERE s.MaxReservationsPerBrandPerMonth < reservations
// aggregate count customers per brand
RETURN b.name, count(distinct c) as customers
As what should customers without reservations be counted, as 0?
Question: which brand would customers without a reservation be related to / checked against? As the reservation count is 0, MaxReservationsPerBrandPerMonth would have to be -1 for those to ever be selected? So we can also just make it non-optional?
Please also share the query plan you get with a PROFILE prefix.
MATCH (c:Customer)-[:HAS_RESERVED]->(r:Reservation)-[:FOR_BRAND]->(b:Brand)
// count reservations by brand and customers
WITH c, b, count(*) as reservations
MATCH (c)-[:IS_SEGMENTED]->(s:Segment)
WHERE s.MaxReservationsPerBrandPerMonth < reservations
// aggregate count customers per brand
RETURN b.name, count(distinct c) as customers
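On the question of counting customers without reservations as 0: with OPTIONAL MATCH, count(*) counts rows, so such a customer still yields 1, whereas count(r) skips the null and yields 0. A minimal sketch of the optional variant with that change, using the labels from the patterns above and assuming that "eligible" means the count is still below the segment's limit:

```cypher
MATCH (c:Customer)-[:IS_SEGMENTED]->(s:Segment)
// keep customers without reservations (r and b are null for them)
OPTIONAL MATCH (c)-[:HAS_RESERVED]->(r:Reservation)-[:FOR_BRAND]->(b:Brand)
// count(r) ignores the null row, so customers without reservations get 0
WITH c, s, b, count(r) AS reservations
// assumption: eligible = still below the segment's monthly limit
WHERE reservations < s.MaxReservationsPerBrandPerMonth
RETURN b.name AS brand, count(DISTINCT c) AS customers
```

Customers without any reservation end up in a group where b is null, since no brand was matched for them.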
Your query is a graph-global one, so it will touch many, many paths; you already have at least 10M db hits.
I'd probably change that query a bit then:
I'm not 100% sure anymore how brand comes into play here, though. It's also a bit hard to do without a dataset!
MATCH (b:brand {BRAND_ID: "3"})
MATCH (c:Customer)
WITH b,c, size((c)-[:CUST_RESERVED]->()) as total_res
WITH b,c,total_res,
     case when total_res > 0 then last(nodes(head((c)-[:SEGMENTED]->(:segment)))).max_sms_per_month else 10 end as max_sms_per_month,
     case when total_res > 0 then size((c)-[:CUST_RESERVED]->()-[:RESERVATION_FOR]->(b)) else 0 end as brand_res
WHERE total_res = 0 OR total_res < max_sms_per_month OR brand_res < max_sms_per_month
// TODO what to do with reservations by brand?
RETURN b.NAME, count(distinct c) as customers
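One possible way to resolve the TODO about per-brand reservations, sketched with the labels and properties from your EDIT 1 query and not run against your data set: because b is bound before the OPTIONAL MATCH, customers without a reservation for that brand keep the brand and simply get a count of 0.

```cypher
MATCH (b:brand {BRAND_ID: "3"})
MATCH (c:Customer)-[:SEGMENTED]->(s:segment)
// b is already bound, so only the reservation part is optional
OPTIONAL MATCH (c)-[:CUST_RESERVED]->(r:Reservation)-[:RESERVATION_FOR]->(b)
// count(r) is 0 for customers without a reservation for this brand
WITH b, c, s, count(r) AS brand_res
// assumption: eligible = fewer reservations for this brand than the segment allows
WHERE brand_res < s.max_sms_per_month
RETURN b.NAME AS brand, count(DISTINCT c) AS customers
```

The date logic from EDIT 2 would fit as a WHERE clause directly after the OPTIONAL MATCH line, filtering r on whichever date property your Reservation nodes carry. Such a WHERE belongs to the optional pattern, so customers without matching reservations are still kept.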