Neo4J query performance

Question

I have a relatively small graph (2.5M nodes, 5M relationships, 7.7M properties). I am running what seems to me a simple query, but it takes 63 seconds to execute on a fast SSD-based laptop. Is this really the performance I should expect from Neo4j, or is something wrong with the query?

start ph=node(2)
match ph-[:NEXT_LEVEL]->c
where c.tag = "class 1"
with c
match c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
return s.tag as store, sum(l.item_quantity) as quantity order by s.tag;

Console output

Update: Just wanted to post the updated query:

start ph=node(2)
match ph-[:NEXT_LEVEL]->c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
where c.tag = "class 1"
with s.tag as store, sum(l.item_quantity) as quantity
return store, quantity order by store;

Console output with the updated query

Answer 1

Unless you have a specific use case for it, try removing the WITH clause; doing so usually improves performance.

start ph=node(2)
match ph-[:NEXT_LEVEL]->c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
where c.tag = "class 1"
return s.tag as store, sum(l.item_quantity) as quantity order by s.tag;

Edit: As discussed in the comments, we can get even better performance by forcing the ORDER BY to happen after aggregation instead of before. We can do this by using WITH (so there's that specific use case we were just talking about). The difference here is that we've moved the WITH clause as far to the end as possible, allowing all previous processing to be grouped together rather than separated.

start ph=node(2)
match ph-[:NEXT_LEVEL]->c-[:NEXT_LEVEL]->p<-[:SOLD]-l<-[:LINE]-h-[:SOLD_IN]->s
where c.tag = "class 1"
with s.tag as store, sum(l.item_quantity) as quantity
return store, quantity order by store;
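
To illustrate why ordering after aggregation can be cheaper, here is a rough Python sketch. It is only a toy model and not how Neo4j executes Cypher internally; the rows and function names are made up for illustration. The point is that sorting after the sum only has to order one row per store, while sorting first has to order every matched row.

```python
from collections import defaultdict

# Hypothetical (store tag, item_quantity) rows standing in for matched paths.
rows = [("store B", 2), ("store A", 5), ("store B", 1), ("store A", 3), ("store C", 4)]


def sort_then_aggregate(rows):
    """Order every matched row by store, then sum quantities per store."""
    totals = {}
    for store, quantity in sorted(rows, key=lambda row: row[0]):
        totals[store] = totals.get(store, 0) + quantity
    # Second value: how many rows had to be sorted.
    return list(totals.items()), len(rows)


def aggregate_then_sort(rows):
    """Sum quantities per store first, then order only the grouped rows."""
    totals = defaultdict(int)
    for store, quantity in rows:
        totals[store] += quantity
    # Second value: how many rows had to be sorted.
    return sorted(totals.items()), len(totals)


if __name__ == "__main__":
    print(sort_then_aggregate(rows))
    print(aggregate_then_sort(rows))
```

Both functions return the same per-store totals; only the number of rows that go through the sort differs. With millions of line items and a handful of stores, that gap is what the WITH placement above is meant to exploit.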

Answer 1 (score 0, accepted), 2013-06-10 15:14:53