Neo4j - Complex Cypher query - needs outer join

I am using the Neo4j graph database, and I am trying to build a complex Cypher query.

I have the following nodes and relationships:

Nodes:

  • Customer node
  • Brand node
  • Segment node (which has a "MaxReservationsPerBrandPerMonth" property)
  • Reservation node

Relationships:

  • Each customer is assigned to a segment
  • A customer can have a cust_reserved relationship with a reservation
  • Each reservation is related to a brand

I want to get the count of customers who are eligible for a new reservation for a certain brand.

So I need to count each customer's reservations for the given brand and compare that count with MaxReservationsPerBrandPerMonth for the customer's segment.

Note that customers who don't have any reservations should also be counted.

Thanks for your support.

EDIT 1

After trying Michael's first query, my query is:

PROFILE MATCH (c:Customer)
// also count customers without reservations (make this match optional)
OPTIONAL MATCH (c)-[:CUST_RESERVED]->(r:Reservation)-[:RESERVATION_FOR]->(b:brand {BRAND_ID: "3"})
// count reservations by brand and customers
WITH c, b, count(*) as reservations
MATCH (c)-[:SEGMENTED]->(s:segment)
WHERE reservations < s.max_sms_per_month
// aggregate count customers per brand
RETURN b.NAME, count(distinct c) as customers

Query result:

The result is unexpected. I want the eligible customers for the "Honda" brand, but the result is only 5 (customers who have already reserved for Honda before but are still eligible because they have not reached the maximum yet). The result should be 998734 (all customers).

Query profile:

Data set: here
Needed query in SQL: here

EDIT 2

Michael's second query worked like a charm, in around 20 seconds. Thank you, Michael. I also need to include some date logic to check reservations during a certain period, please :)

You already spelled it out. Please see our online guide for how you'd go from your use-case question to a model: http://neo4j.com/developer/guide-data-modeling/

So your patterns are:

(:Customer)-[:IS_SEGMENTED]->(:Segment {MaxReservationsPerBrandPerMonth:int})
(:Customer)-[:HAS_RESERVED]->(:Reservation)
(:Reservation)-[:FOR_BRAND]->(:Brand)

Your question:

  1. I want to get the count of customers who are eligible for a new reservation for a certain brand.
  2. So I need to count each customer's reservations for the given brand and compare that count with MaxReservationsPerBrandPerMonth for the customer's segment.
  3. Note that customers who don't have any reservations should also be counted.

MATCH (c:Customer)
// also count customers without reservations (make this match optional)
OPTIONAL MATCH (c)-[:HAS_RESERVED]->(r:Reservation)-[:FOR_BRAND]->(b:Brand)
// count reservations by brand and customers
WITH c, b, count(*) as reservations
MATCH (c)-[:IS_SEGMENTED]->(s:Segment)
WHERE s.MaxReservationsPerBrandPerMonth < reservations
// aggregate count customers per brand
RETURN b.name, count(distinct c) as customers

As what should customers without reservations be counted? As 0?

Update: Query for non-optional reservations

Question: which brand would customers without a reservation be related to / checked against? As the reservations count is 0, MaxReservationsPerBrandPerMonth would have to be -1 for those customers to ever be selected. So we can also just make the match non-optional? Please also share the query plan you get with a PROFILE prefix.

MATCH (c:Customer)-[:HAS_RESERVED]->(r:Reservation)-[:FOR_BRAND]->(b:Brand)
// count reservations by brand and customers
WITH c, b, count(*) as reservations
MATCH (c)-[:IS_SEGMENTED]->(s:Segment)
WHERE s.MaxReservationsPerBrandPerMonth < reservations
// aggregate count customers per brand
RETURN b.name, count(distinct c) as customers
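
If the match stays optional instead, the choice of aggregate decides what a customer without reservations counts as. A small sketch, using the relationship names from the patterns above: OPTIONAL MATCH still emits one row for such a customer (with r and b set to null), so count(*) sees that row, while count(r) skips nulls.

```cypher
MATCH (c:Customer)
OPTIONAL MATCH (c)-[:HAS_RESERVED]->(r:Reservation)-[:FOR_BRAND]->(b:Brand)
// For a customer with no reservations: row_count = 1, reservations = 0
RETURN c, b, count(*) AS row_count, count(r) AS reservations
```

Note also that b is null on those rows, so grouping by b puts customers without reservations under a null brand rather than under any real one.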

Update 2, Query Optimization Attempt

Your query is a graph-global one, so it will touch many, many paths; you have at least 10M db-hits already.

I'd probably change that query a bit:

I'm not 100% sure anymore how brand comes into play here, though. It's also a bit hard to do without a dataset!

MATCH (b:brand {BRAND_ID: "3"})
MATCH (c:Customer) 
WITH b,c, size((c)-[:CUST_RESERVED]->()) as total_res
WITH b,c,total_res, 
     case when total_res > 0 then last(nodes(head((c)-[:SEGMENTED]->(:segment)))).max_sms_per_month else 10 end as max_sms_per_month,
     case when total_res > 0 then size((c)-[:CUST_RESERVED]->()-[:RESERVATION_FOR]->(b)) else 0 end as brand_res
WHERE total_res = 0 OR total_res < max_sms_per_month OR brand_res < max_sms_per_month
// TODO what to do with reservations by brand?   
RETURN b.NAME, count(distinct c) as customers
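
An alternative that may be worth profiling, sketched with the labels and properties from the question's query: bind the brand first so it is never null, keep the reservation match optional, and count only real reservations. The date filter addresses the request in EDIT 2. RESERVATION_DATE and the $from / $to parameters are assumed names that do not appear in the question, so substitute whatever the Reservation nodes actually store. This is untested against the data set.

```cypher
// Sketch only: RESERVATION_DATE, $from and $to are assumed names.
MATCH (b:brand {BRAND_ID: "3"})
MATCH (c:Customer)-[:SEGMENTED]->(s:segment)
// The WHERE below belongs to the OPTIONAL MATCH, so a customer with no
// reservation in the window is kept, with r = null.
OPTIONAL MATCH (c)-[:CUST_RESERVED]->(r:Reservation)-[:RESERVATION_FOR]->(b)
WHERE r.RESERVATION_DATE >= $from AND r.RESERVATION_DATE < $to
// count(r) skips nulls: 0 for customers without a matching reservation.
WITH b, c, s, count(r) AS brand_res
WHERE brand_res < s.max_sms_per_month
RETURN b.NAME AS brand, count(DISTINCT c) AS customers
```

Because the brand is matched up front, customers without any reservation are still grouped under that brand and compared against their segment's limit with a count of 0.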
