
Large number of relationship types in Cypher query

I'm prototyping a user-authorization/data-protection scheme in Neo4j, and I ran into a strange issue with one of my queries. For background, the concept is that a user trying to get from a to b can get to b if they have the correct access identifier. So, our edges are of types that have access identifiers in them. I'm testing this scheme by creating lots of nodes and connecting pairs of them with different accesses. That is, I have lots of sets of:

(a)-[:ACCESS_A]->(b)

with different accesses. I query for them with:

{some query} with a match (a)-[:ACCESS_A|:ACCESS_B|<...>|:ACCESS_Z]->(b) return b

where the size of the list in the edge match grows with the number of accesses the user has.
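As far as I know, Cypher does not accept relationship types as query parameters, so a list like this has to be spliced into the query text. Below is a minimal Python sketch of that step. The function name `build_access_pattern` and the name-validation rule are my own assumptions, not part of the original setup, and the `|:` separator simply mirrors the syntax used above (newer Neo4j versions may prefer a plain `|`).

```python
import re

# Hypothetical rule: only allow plain identifier-style type names,
# since the names are concatenated into the query text.
_VALID_TYPE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_access_pattern(access_types):
    """Return a pattern such as (a)-[:ACCESS_A|:ACCESS_B]->(b)."""
    if not access_types:
        raise ValueError("the user has no accesses; nothing to match")
    for name in access_types:
        if not _VALID_TYPE.match(name):
            raise ValueError(f"unsafe relationship type name: {name!r}")
    # Join with the same "|:" separator the question uses.
    alternation = "|:".join(access_types)
    return f"(a)-[:{alternation}]->(b)"


pattern = build_access_pattern(["ACCESS_A", "ACCESS_B", "ACCESS_Z"])
print(pattern)
```

The pattern grows by one alternative per access the user holds, which is how the list reaches 200+ entries in the tests described here.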

This all works great, until the list gets to 201 accesses. At this point, the profile shows the db hits and time taken go way up. At 200 relationship types, the profile shows 1051 db hits, but 201 relationship types shows 31801. That's a 30-fold increase for one more type! Time taken increases in a similar manner. Going from 199 to 200 only goes up by about 50 hits, and that's due to an increasing number of nodes hit.

After more work, it looks like the round number 200 is more a red herring than the issue. Previously, my relationship types were 4 characters long. When I changed them to 9 characters (prepending "EDGE_", as a test), the issue began occurring at 50 types: 50 types gives 36 db hits, while 51 gives 291. That is a smaller jump, but significant when compared to previous increases in the same test.

It appears that there is some relation between the relationship type names and the point where the query falls off, but I'm still investigating.

Things that I've tested and not found to be of interest:

  • length of the overall query (string size): it fails at entirely different query sizes with 4- and 9-character relationship types
  • length of the list in the [e:<...>] clause (string size): as above, it fails at very different sizes
  • number of nodes or edges in the graph

To my knowledge, you shouldn't be running into performance issues with only 200 relationship types.

Prior to version 3.0, the number of relationship types was capped at 64k. That limit was removed with version 3.0.

I was able to discover the solution to my issue. It appears that asking Neo4j for many more relationship types than actually exist in the graph causes the issue. I was able to use many more than 200 when those types all existed. Therefore, the solution is to ensure you do not ask for any types that are not in the graph.
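One way to apply this is to filter the user's access list against the types the graph actually contains before building the query. The Python sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and `existing` is assumed to have been fetched beforehand, for example with Neo4j's `CALL db.relationshipTypes()` procedure. It needs no database connection to run.

```python
def filter_existing_types(requested, existing):
    """Keep only requested types that exist in the graph, in order, without duplicates."""
    existing_set = set(existing)
    seen = set()
    kept = []
    for name in requested:
        if name in existing_set and name not in seen:
            seen.add(name)
            kept.append(name)
    return kept


def build_match(requested, existing):
    """Build the MATCH text, or return None if no requested type exists."""
    kept = filter_existing_types(requested, existing)
    if not kept:
        # Nothing can match, so the caller can skip the query entirely.
        return None
    # Same "|:" separator as in the question; adjust for your Neo4j version.
    return "MATCH (a)-[:" + "|:".join(kept) + "]->(b) RETURN b"


# Assumed result of an earlier CALL db.relationshipTypes() lookup.
existing = ["ACCESS_A", "ACCESS_C"]
print(build_match(["ACCESS_A", "ACCESS_B", "ACCESS_C"], existing))
```

Returning `None` when nothing survives the filter avoids sending a query that could not match anything anyway.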
