Neo4j Cypher execution plan when query has WHERE and WITH clause

I have a Neo4j graph database that stores staffing relationships and nodes. I have to write a Cypher query that finds the home and office address of a resource (or employee) along with their empId and name. This is needed so that the Staffing Solution can staff resources according to their home location as well as their proximity to the office.

MATCH (employee:Employee) <-[:ADDRESS_TO_EMPLOYEE]- (homeAddress:HomeAddress) 
WHERE employee.id = '70' 
WITH  employee, homeAddress 
MATCH (employee)-[:EMPLOYEE_TO_OFFICEADDRESS]->(officeAddress:OfficeAddress) 
RETURN employee.empId, employee.name,  
homeAddress.street, homeAddress.area, homeAddress.city,  
officeAddress.street, officeAddress.area, officeAddress.city

This query returns the desired results.

However, if I move the WHERE condition to the end, just before the RETURN clause:

MATCH (employee:Employee) <-[:ADDRESS_TO_EMPLOYEE]- (homeAddress:HomeAddress) 
WITH  employee, homeAddress  
MATCH (employee)-[:EMPLOYEE_TO_OFFICEADDRESS]->(officeAddress:OfficeAddress) 
WHERE employee.id = '70' 
RETURN employee.empId, employee.name,  
homeAddress.street, homeAddress.area, homeAddress.city,  
officeAddress.street, officeAddress.area, officeAddress.city 

It again gives me the same result.

So which one is better optimized, given that the query execution plan is the same in both cases? I mean the same number of DB hits and returned records.

Now, if I remove the WITH clause:

MATCH (employee:Employee) <-[:ADDRESS_TO_EMPLOYEE]- (homeAddress:HomeAddress) 
MATCH (employee)-[:EMPLOYEE_TO_OFFICEADDRESS]->(officeAddress:OfficeAddress) 
WHERE employee.id = '70' 
RETURN employee.empId, employee.name, 
homeAddress.street, homeAddress.area, homeAddress.city, 
officeAddress.street, officeAddress.area, officeAddress.city

Then again the result is the same, and the execution plan is also the same.

Do I really need WITH in this case?

Any help would be greatly appreciated.

First, you can use PROFILE and EXPLAIN to inspect the performance of your query. That said, as long as you get the results you want in the time you want, the exact form of the Cypher doesn't matter too much, because the behavior will change depending on the Cypher planner (version) running in the db. So as long as the query passes unit and load tests, the rest doesn't matter (assuming reasonably accurate tests).
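As a rough sketch of the first point, this is how one of the variants from the question could be profiled. The labels, relationship types, and properties are the ones from the question; the shortened RETURN list is just for illustration.

```cypher
// PROFILE executes the query and reports rows and db hits per operator.
// Replace PROFILE with EXPLAIN to see the plan without running the query.
PROFILE
MATCH (employee:Employee)<-[:ADDRESS_TO_EMPLOYEE]-(homeAddress:HomeAddress)
WHERE employee.id = '70'
MATCH (employee)-[:EMPLOYEE_TO_OFFICEADDRESS]->(officeAddress:OfficeAddress)
RETURN employee.empId, employee.name, homeAddress.city, officeAddress.city
```

Running each variant this way and comparing the operator trees and db hits is how you would confirm that the planner treats them the same on your version.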

Second, in general, less is more. Imagine you had to read your own Cypher and look up the info yourself on paper printouts. Isn't MATCH (officeAddress:OfficeAddress)<-[:EMPLOYEE_TO_OFFICEADDRESS]-(employee:Employee {id:'70'})<-[:ADDRESS_TO_EMPLOYEE]-(homeAddress:HomeAddress) so much easier to read for what exactly you are looking for? The easier it is for the Cypher planner to read what you want, the more likely it is to plan the most efficient lookup strategy. Keeping your WHERE clause close to the relevant MATCH also helps the planner. So try to keep your queries as simple as possible, while still being accurate for what you want.

In your Cypher, the only part that really matters is the WITH. WITH creates a logical break in the query and a scope change for variables. As you aren't doing anything with the WITH, it's better to drop it. The only side effect it can produce in this case is tricking the planner into doing more work than necessary for the first match, only to filter it down later. If an Employee is expected to have more than one home address, then WITH employee, COLLECT(homeAddress) AS homeAddresses will reduce that match to 1 row per employee, making the next match cheaper. But since I'm sure both sides of the match should only yield 1 result, it doesn't matter what the planner does first. (In general, you use WITH to aggregate results down to fewer rows, to make the rest of the query cheaper, which shouldn't apply in this context.)
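To make the aggregation case concrete, here is a minimal sketch that assumes an employee may have several home addresses (which, as said, probably isn't true for your data). The aliases homeAddresses, homes, and office are made up for the example.

```cypher
MATCH (employee:Employee {id: '70'})<-[:ADDRESS_TO_EMPLOYEE]-(homeAddress:HomeAddress)
// Aggregate: one row per employee, with all home addresses gathered in a list.
WITH employee, collect(homeAddress) AS homeAddresses
// This MATCH now runs once per employee instead of once per home address.
MATCH (employee)-[:EMPLOYEE_TO_OFFICEADDRESS]->(officeAddress:OfficeAddress)
RETURN employee.empId, employee.name,
       [h IN homeAddresses | h {.street, .area, .city}] AS homes,
       officeAddress {.street, .area, .city} AS office
```

Here the WITH earns its place because it changes the number of rows flowing into the second MATCH, unlike the pass-through WITH in the question.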

  1. You should always put a WHERE clause as early as possible in a query. That filters out data that the rest of the query then does not have to deal with, avoiding possibly unneeded work.

  2. You should avoid writing a WITH clause that just passes forward all the defined variables (and is not required syntactically), since it is essentially a no-op. It wastes (a little bit of) the planner's time, and makes the Cypher code a bit harder to understand.
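For contrast with point 2, here is a sketch of a case where WITH is required syntactically: filtering on an aggregate. It is a different query from the one in the question (it lists employees with more than one home address), and homeCount is an alias invented for the example.

```cypher
MATCH (employee:Employee)<-[:ADDRESS_TO_EMPLOYEE]-(homeAddress:HomeAddress)
// count() is an aggregate, so it has to be computed in a WITH
// before a WHERE can filter on its value.
WITH employee, count(homeAddress) AS homeCount
WHERE homeCount > 1
RETURN employee.empId, employee.name, homeCount
```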

For your query, this simpler version should produce the same query plan:

MATCH (officeAddress:OfficeAddress)<-[:EMPLOYEE_TO_OFFICEADDRESS]-(employee:Employee)<-[:ADDRESS_TO_EMPLOYEE]-(homeAddress:HomeAddress) 
WHERE employee.id = '70' 
RETURN
  employee.empId, employee.name,  
  homeAddress.street, homeAddress.area, homeAddress.city,  
  officeAddress.street, officeAddress.area, officeAddress.city

And the following version (using the map projection syntax) is even simpler (with a similar query plan).

MATCH (officeAddress:OfficeAddress)<-[:EMPLOYEE_TO_OFFICEADDRESS]-(employee:Employee)<-[:ADDRESS_TO_EMPLOYEE]-(homeAddress:HomeAddress) 
WHERE employee.id = '70' 
RETURN
  employee{.empId, .name},  
  homeAddress{.street, .area, .city},  
  officeAddress{.street, .area, .city}

The results of the above query have a different structure, though:

╒═══════════════════════════╤══════════════════════════════════════╤══════════════════════════════════════╕
│"employee"                 │"homeAddress"                         │"officeAddress"                       │
╞═══════════════════════════╪══════════════════════════════════════╪══════════════════════════════════════╡
│{"name":"sam","empId":"70"}│{"area":1,"city":"foo","street":"123"}│{"area":2,"city":"bar","street":"345"}│
└───────────────────────────┴──────────────────────────────────────┴──────────────────────────────────────┘
