Neo4J query performance

I have a relatively small graph (2.5M nodes, 5M relationships, 7.7M properties), and what seems to me a simple query takes 63 seconds to execute on a fast SSD-based laptop. Is this really the performance I should expect from Neo4j, or is there something wrong with the query?

start ph=node(2)
match ph-[:NEXT_LEVEL]->c
where c.tag = "class 1"
with c
match c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
return s.tag as store, sum(l.item_quantity) as quantity order by s.tag;

[Screenshot: console output]

Update: Just wanted to post the updated query:

start ph=node(2)
match ph-[:NEXT_LEVEL]->c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
where c.tag = "class 1"
with s.tag as store, sum(l.item_quantity) as quantity
return store, quantity order by store;

[Screenshot: console output with the updated query]

Unless you have a specific use case for it, try removing the WITH clause: placed in the middle of the query, it splits the processing into two separate parts, and dropping it usually improves performance.

start ph=node(2)
match ph-[:NEXT_LEVEL]->c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
where c.tag = "class 1"
return s.tag as store, sum(l.item_quantity) as quantity order by s.tag;

Edit: As discussed in the comments, performance improves further if the ORDER BY is forced to run after the aggregation instead of before it. WITH does exactly that (so this is the kind of specific use case mentioned above). The difference from the original query is that the WITH clause now sits as late as possible, so all the earlier matching and filtering is processed together rather than being split up.

start ph=node(2)
match ph-[:NEXT_LEVEL]->c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
where c.tag = "class 1"
with s.tag as store, sum(l.item_quantity) as quantity
return store, quantity order by store;
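
For readers on a newer Neo4j release, where the START clause and unparenthesized node patterns are no longer accepted, the same query might look roughly like the sketch below. This port is an assumption, not something from the original posts: it selects the start node with id(ph) = 2 (id() is itself deprecated in recent versions) and prefixes the query with PROFILE, which runs the query and shows the executed plan with row counts, so you can check that the sort is applied to the aggregated rows.

```cypher
// Sketch of a port to newer Cypher syntax (assumed; not from the original posts).
// PROFILE executes the query and reports the plan with per-operator row counts.
PROFILE
MATCH (ph)-[:NEXT_LEVEL]->(c)-[:NEXT_LEVEL]->(p)<-[:SOLD]-(l)<-[:LINE]-(h)-[:SOLD_IN]->(s)
WHERE id(ph) = 2 AND c.tag = "class 1"
// Aggregate first: one row per store tag.
WITH s.tag AS store, sum(l.item_quantity) AS quantity
// Sort only the aggregated rows.
RETURN store, quantity
ORDER BY store;
```

Running the same query without the WITH line (returning s.tag and the sum directly) under PROFILE should make it possible to compare the two plans on your own data.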
