[英]Neo4j cypher query improvement (performance)
我有以下密碼查詢:
CALL apoc.index.nodes('node_auto_index','pref_label:(Foo)')
YIELD node, weight
WHERE node.corpus = 'my_corpus'
WITH node, weight
MATCH (selected:ontoterm{corpus:'my_corpus'})-[:spotted_in]->(:WEBSITE)<-[:spotted_in]-(node:ontoterm{corpus:'my_corpus'})
WHERE selected.uri = 'http://uri1'
OR selected.uri = 'http://uri2'
OR selected.uri = 'http://uri3'
RETURN DISTINCT node, weight
ORDER BY weight DESC LIMIT 10
第一部分(直到WITH)運行非常快(Lucene舊索引),並返回約100個節點。 uri屬性也是唯一的(選擇= 3個節點),我有〜300個WEBSITE節點。 執行時間為48749毫秒。
如何重組查詢以提高性能? 為何配置文件中有〜13.8 Mio行?
我認為問題出在WITH子句中,它極大地擴展了結果。 InverseFalcon的答案使查詢速度更快:49-> 18秒(但仍然不夠快)。 為了避免巨大的擴展,我收集了網站。 以下查詢耗時60ms
MATCH (selected:ontoterm)-[:spotted_in]->(w:WEBSITE)
WHERE selected.uri in ['http://avgl.net/carbon_terms/Faser', 'http://avgl.net/carbon_terms/Carbon', 'http://avgl.net/carbon_terms/Leichtbau']
AND selected.corpus = 'carbon_terms'
with collect(distinct(w)) as websites
CALL apoc.index.nodes('node_auto_index','pref_label:(Fas OR Fas*)^10 OR pref_label_deco:(Fas OR Fas*)^3 OR alt_label:(Fa)^5') YIELD node, weight
WHERE node.corpus = 'carbon_terms' AND node:ontoterm
WITH websites, node, weight
match (node)-[:spotted_in]->(w:WEBSITE)
where w in websites
return node, weight
ORDER BY weight DESC
LIMIT 10
我在您的計划中沒有看到NodeUniqueIndexSeek的任何情況,因此無法有效地查找selected
節點。
確保對:ontoterm(uri)有唯一的約束。
設置唯一約束后,嘗試一下:
PROFILE CALL apoc.index.nodes('node_auto_index','pref_label:(Foo)')
YIELD node, weight
WHERE node.corpus = 'my_corpus' AND node:ontoterm
WITH node, weight
MATCH (selected:ontoterm)
WHERE selected.uri in ['http://uri1', 'http://uri2', 'http://uri3']
AND selected.corpus = 'my_corpus'
WITH node, weight, selected
MATCH (selected)-[:spotted_in]->(:WEBSITE)<-[:spotted_in]-(node)
RETURN DISTINCT node, weight
ORDER BY weight DESC LIMIT 10
看一下查詢計划。 您應該在那里看到某個地方的NodeUniqueIndexSeek,並希望您應該看到數據庫命中率下降。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.