neo4j - Improving a Cypher query

I have a performance-critical application that has to match multiple nodes to another node based on regex matching. My current query is as follows:

MATCH (person: Person {name: 'Mark'})
WITH person
UNWIND person.match_list AS match
MATCH (pet: Animal) 
WHERE pet.name_regex =~ match
MERGE (person)-[:OWNS_PET]->(pet) 
RETURN pet

However, this query runs very slowly (around 500 ms on my workstation). The graph contains around 500K nodes, and around 10K of them will match the regex.

I'm wondering whether there is a more efficient way to rewrite this query so that it returns the same result but runs faster.

EDIT:

When I run this query for several Persons from multiple threads, I get a TransientError exception:

neo4j.exceptions.TransientError: ForsetiClient[3] can't acquire ExclusiveLock{owner=ForsetiClient[14]} on NODE(1889), because holders of that lock are waiting for ForsetiClient[3].

EDIT 2:

Person:name is unique and indexed.

Animal:name_regex is not indexed.

First, I would start by simplifying your query as much as possible. The way you are doing it now wastes a lot of effort after a match has been found:

MATCH (person: Person {name: 'Mark'}), (pet: Animal)
WHERE ANY(match in person.match_list WHERE pet.name_regex =~ match)
MERGE (person)-[:OWNS_PET]->(pet) 
RETURN pet

This way, only one MERGE is attempted per pet even if several patterns match, and once one pattern has matched a pet, the remaining patterns are not tested against that pet. It also lets Cypher optimize the query to the best of its ability for your data.
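To make the difference concrete, here is a small Python sketch that only mimics the row counts of the two query shapes; it does not talk to Neo4j. The function names and the sample data are made up for illustration, and `re.fullmatch` stands in for Cypher's `=~` (which has to match the whole string), so treat it as an approximation rather than an exact model of the planner.

```python
import re

def unwind_rows(patterns, pet_names):
    """Mimic UNWIND + MATCH: one row per (pattern, pet) pair that matches."""
    return [(p, name) for p in patterns for name in pet_names
            if re.fullmatch(p, name)]

def any_rows(patterns, pet_names):
    """Mimic ANY(...): one row per pet; any() stops at the first matching pattern."""
    return [name for name in pet_names
            if any(re.fullmatch(p, name) for p in patterns)]

# Made-up stand-ins for person.match_list and the pets' name_regex values.
patterns = ["Re.*", ".*x"]
pet_names = ["Rex", "Rover", "Max"]

print(len(unwind_rows(patterns, pet_names)))  # 3 rows, "Rex" appears twice
print(len(any_rows(patterns, pet_names)))     # 2 rows, one per matching pet
```

In the first shape every extra row reaches MERGE, so a pet matched by several patterns is merged several times; in the second shape each pet reaches MERGE at most once.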

To improve the query further, you will need to optimize your data model. For example, regex matching is expensive (it requires a node scan plus a string scan). If the match patterns are largely shared between people, it would be better to break them out into nodes of their own and connect people to those nodes, so that the work of one regex match can be reused everywhere the pattern is repeated.
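One possible shape for that refactoring is sketched below, as Cypher strings driven from Python. Everything beyond the original labels is an assumption made for illustration: the `Pattern` label, its `regex` and `evaluated` properties, the `USES_PATTERN` and `MATCHES` relationship types, and the `link_pets` helper are all hypothetical. `session` is assumed to be any object with a `run(query, parameters)` method, such as a session from the neo4j Python driver.

```python
# Hypothetical schema:
# (:Person)-[:USES_PATTERN]->(:Pattern {regex})-[:MATCHES]->(:Animal)

# Turn each entry of match_list into a shared Pattern node.
LINK_PATTERNS = """
MATCH (person:Person {name: $name})
UNWIND person.match_list AS regex
MERGE (pattern:Pattern {regex: regex})
MERGE (person)-[:USES_PATTERN]->(pattern)
"""

# Run the expensive regex scan only for patterns not evaluated before.
EVALUATE_NEW_PATTERNS = """
MATCH (pattern:Pattern)
WHERE pattern.evaluated IS NULL
SET pattern.evaluated = true
WITH pattern
MATCH (pet:Animal)
WHERE pet.name_regex =~ pattern.regex
MERGE (pattern)-[:MATCHES]->(pet)
"""

# Linking a person to pets is now a plain traversal, with no regex work.
OWN_PETS = """
MATCH (person:Person {name: $name})-[:USES_PATTERN]->(:Pattern)-[:MATCHES]->(pet:Animal)
WITH DISTINCT person, pet
MERGE (person)-[:OWNS_PET]->(pet)
RETURN pet
"""

def link_pets(session, name):
    """Link one person to all pets reachable through shared Pattern nodes."""
    params = {"name": name}
    session.run(LINK_PATTERNS, params)
    session.run(EVALUATE_NEW_PATTERNS)
    return [record["pet"] for record in session.run(OWN_PETS, params)]
```

With this layout, a pattern shared by many people is scanned against the Animal nodes once rather than once per person. The sketch does not handle animals added after a pattern was evaluated; those would need the pattern to be re-evaluated (for example by clearing the hypothetical `evaluated` flag).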
