Neo4j Cypher query performance optimization

I have the following Neo4j Cypher query:

MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision) 
WHERE dg.id = 1 
MATCH (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(filterCharacteristic4:Characteristic) 
WHERE filterCharacteristic4.id = 4 
WITH relationshipValueRel4, childD, dg 
WHERE ANY (id IN [2,3] WHERE id IN relationshipValueRel4.optionIds) 
WITH childD, dg  
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion) 
WHERE c.id IN [2, 3] 
WITH childD, dg, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes 
WITH childD , dg , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes  
ORDER BY  weight DESC 
SKIP 0 LIMIT 10 
WITH * 
MATCH (childD)-[ru:CREATED_BY]->(u:User) 
OPTIONAL MATCH (childD)-[rup:UPDATED_BY]->(up:User) 
RETURN ru, u, rup, up, childD AS decision, weight, totalVotes, 
[ (dg)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) 
  | {entityId: toInt(entity.id), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups, 
[ (dg)<-[:DEFINED_BY]-(c1)<-[vg1:HAS_VOTE_ON]-(childD) 
  | {criterionId: toInt(c1.id), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria, 
[ (dg)<-[:DEFINED_BY]-(ch1:Characteristic)<-[v1:HAS_VALUE_ON]-(childD) 
  WHERE NOT ((ch1)<-[:DEPENDS_ON]-()) 
  | {characteristicId: toInt(ch1.id), optionIds: v1.optionIds, valueIds: v1.valueIds, value: v1.value, available: v1.available, totalHistoryValues: v1.totalHistoryValues, totalFlags: v1.totalFlags, description: v1.description, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics

I'm not satisfied with the performance of this query.

This is the PROFILE output:

Cypher version: CYPHER 3.3, planner: COST, runtime: INTERPRETED. 3296130 total db hits in 2936 ms

[image: PROFILE query plan, not reproduced here]

Is there any way to optimize the performance of this query?

It will be a little hard to optimize this query without a dataset, knowledge of your graph, and a description of what you are trying to achieve.

Performance depends on:

  1. The query itself
  2. Schema (indexes and constraints)
  3. Graph modeling
  4. Neo4j configuration
  5. Hardware
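
On the schema point, here is a minimal sketch of the indexes that would back the id lookups in the question's query. The question does not show the existing schema, so whether these are missing is an assumption; the syntax is the Cypher 3.x form that matches the version in the PROFILE output.

```cypher
// Assumption: these indexes do not exist yet. Run each statement once.
CREATE INDEX ON :DecisionGroup(id);
CREATE INDEX ON :Characteristic(id);
CREATE INDEX ON :Criterion(id);

// If the ids are unique per label (also an assumption), a uniqueness
// constraint can be used instead of a plain index, for example:
// CREATE CONSTRAINT ON (dg:DecisionGroup) ASSERT dg.id IS UNIQUE;
```

Running the query with PROFILE again afterwards shows whether the plan starts from an index seek rather than a label scan.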

There is no big problem with your query. It could be written in a more readable form (for example: one big MATCH, inline property syntax in the MATCH instead of separate WHERE clauses, replacing the ANY with an OR, ...), but that will not change the query plan.
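
As an illustration of that more readable form, here is a sketch of the filtering part of the query (everything up to the first `WITH childD, dg`) rewritten with a single MATCH, inline property maps, and an OR in place of the ANY. The short variable name `v` is only an illustrative replacement for `relationshipValueRel4`. This is expected to produce the same rows and, as noted, the same plan.

```cypher
// Same filter as the original first two MATCH clauses, written as one pattern.
MATCH (dg:DecisionGroup {id: 1})-[:CONTAINS]->(childD:Decision)
      -[v:HAS_VALUE_ON]-(:Characteristic {id: 4})
// ANY (id IN [2,3] WHERE id IN v.optionIds) expanded into an OR
WHERE 2 IN v.optionIds OR 3 IN v.optionIds
WITH childD, dg
// ... rest of the query unchanged (OPTIONAL MATCH on HAS_VOTE_ON, etc.)
RETURN childD, dg
```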

Be sure to use query parameters with this query, so that the plan for this long query does not have to be recomputed every time it runs.
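
A sketch of what the parameterized filtering part could look like. The parameter names (`$decisionGroupId`, `$characteristicId`, `$optionIds`, `$criterionIds`) are hypothetical; the values are supplied by the client or driver with each execution, so the query text stays identical and the cached plan can be reused.

```cypher
// Literals 1, 4, [2,3] and [2, 3] replaced by parameters.
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = $decisionGroupId
MATCH (childD)-[v:HAS_VALUE_ON]-(ch:Characteristic)
WHERE ch.id = $characteristicId
  AND ANY (id IN $optionIds WHERE id IN v.optionIds)
WITH childD, dg
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN $criterionIds
RETURN childD, sum(vg.avgVotesWeight) AS weight, sum(vg.totalVotes) AS totalVotes
```

Keeping ANY here, rather than expanding it into an OR, lets the list of option ids vary in length without changing the query text.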

Your query spends most of its time on (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(:Characteristic) plus the WHERE clause on it (i.e. 1.5M * 2 db hits). So one solution could be to change the model by creating dedicated relationships such as HAS_VALUE_ON_WITH_OPTID_1, HAS_VALUE_ON_WITH_OPTID_2, ...
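
A rough sketch of that remodeling, under two assumptions: HAS_VALUE_ON points from the Decision to the Characteristic (as the pattern comprehensions in the question suggest), and one statement is run per option id, since plain Cypher cannot create a relationship type from a value. The new relationships also have to be kept in sync whenever optionIds changes.

```cypher
// One-off migration, repeated for each option id (2 and 3 shown here).
MATCH (d:Decision)-[v:HAS_VALUE_ON]->(ch:Characteristic)
WHERE 2 IN v.optionIds
MERGE (d)-[:HAS_VALUE_ON_WITH_OPTID_2]->(ch);

MATCH (d:Decision)-[v:HAS_VALUE_ON]->(ch:Characteristic)
WHERE 3 IN v.optionIds
MERGE (d)-[:HAS_VALUE_ON_WITH_OPTID_3]->(ch);

// The filter then becomes a traversal by relationship type, with no
// property read on every HAS_VALUE_ON relationship.
MATCH (dg:DecisionGroup {id: 1})-[:CONTAINS]->(childD:Decision)
MATCH (childD)-[:HAS_VALUE_ON_WITH_OPTID_2|HAS_VALUE_ON_WITH_OPTID_3]->(:Characteristic {id: 4})
// DISTINCT because a decision holding both options matches twice
WITH DISTINCT childD, dg
RETURN childD, dg;
```

The trade-off is more relationships and more write-side bookkeeping in exchange for a cheaper read path, so it is worth confirming the gain with PROFILE on real data.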
