
Neo4j match by range performance

I've got the following setup:

About 1.5 million nodes of type IpRangeBlock, each with start and end properties, both of type Long. There's an index on the start property.

What I then do is find the range containing a given IP. For example, for the IP 0.0.0.2 I convert it to a long and then compare it against all nodes: n.start <= 2 && n.end >= 2.
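As a minimal sketch of the conversion described above (the question does not say which language the app uses, so Python and the helper names ip_to_long and in_range are assumptions for illustration only):

```python
import ipaddress


def ip_to_long(ip: str) -> int:
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def in_range(ip: str, start: int, end: int) -> bool:
    """Same check the query performs: start <= ip AND end >= ip."""
    value = ip_to_long(ip)
    return start <= value <= end
```

Here ip_to_long("0.0.0.2") gives 2, which is the value compared against n.start and n.end.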

The Cypher query I run looks like this:

MATCH (n:IpRangeBlock) WHERE n.start <= {ip} AND n.end >= {ip} RETURN n LIMIT 1

This works, but as I mentioned, with 1.5 million nodes it can take up to 20s for Neo4j to find the matching range. My question is: is there a way to speed up this operation, or is the fault in my DB design?

OK, I tried caching node references and performing the comparison on the app side. As you might expect, pulling that many nodes takes time.

So I tried another approach: I examined our data set, and it turned out that within each IP range the start and end properties begin with the same first octet. I used those octets as grouping nodes to quickly narrow down the subset of probable IP ranges. This worked well, as our dataset is actually well distributed across all IP ranges. Now, instead of comparing 100k nodes' properties, each query has to do it 'only' for around 8-10k.
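A rough sketch of how a lookup through such grouping nodes might be built. The answer does not show its schema, so the Octet label, its value property, the HAS_RANGE relationship and the build_lookup helper are all hypothetical names. The query string keeps the {param} style used in the question; newer Neo4j versions write parameters as $ip.

```python
import ipaddress

# Hypothetical schema (not from the original post):
#   (:Octet {value})-[:HAS_RANGE]->(:IpRangeBlock {start, end})
GROUPED_QUERY = (
    "MATCH (o:Octet {value: {octet}})-[:HAS_RANGE]->(n:IpRangeBlock) "
    "WHERE n.start <= {ip} AND n.end >= {ip} "
    "RETURN n LIMIT 1"
)


def build_lookup(ip: str):
    """Return the query text and its parameters for one IP lookup."""
    value = int(ipaddress.IPv4Address(ip))
    first_octet = value >> 24  # top 8 bits of the 32-bit address
    return GROUPED_QUERY, {"octet": first_octet, "ip": value}
```

The idea is that matching the single Octet node first limits the start/end comparison to the ranges attached to it, rather than to every IpRangeBlock node.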

I know it's not a perfect approach, but it worked for me. I got this idea from a Neo4j article.
