
Avoiding certain traversal

I have a database of about 10M nodes. 9.9M of those nodes are details that are not used in 99% of the queries but are still required for the other 1%. For that 99%, how do I tell the graph database not to traverse into specific nodes, no matter what, in queries where the path may be a wildcard?

Apologies for tagging three graph databases; I'm still evaluating which graph database is the right one for us to use.

When working with Neo4j you can use labels to group nodes into sets. Examples of labels are :User, :Product, :Admin, etc. Relationships between nodes can also be typed.

These constructs can be used at query time to tell the database which node labels and relationship types to use.

Examples:

1 - Return only nodes with a specific node label (:User):

MATCH (u:User)
RETURN u

2 - Return nodes that have the :User label but not the :Administrator label (a node can have more than one label):

MATCH (u:User)
WHERE NOT u:Administrator
RETURN u

3 - Match the pattern between a :User and a :Product, following only relationships of type :BUY from the user to the product, where u.id = 10 and excluding users that are also :Administrator. Return the user and the related product.

MATCH (u:User)-[:BUY]->(p:Product)
WHERE u.id = 10 AND NOT u:Administrator
RETURN u, p
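
To make the idea behind these filters concrete for the wildcard-path case in the question, here is a minimal in-memory sketch in Python. It is not Neo4j's API, only a toy model: the `Detail` label, the node ids and the `reachable` helper are all assumptions made up for illustration. It shows a traversal that refuses to step onto any node carrying an excluded label, so nothing behind such a node is ever expanded.

```python
from collections import deque

# Toy graph (hypothetical data): node id -> labels, node id -> outgoing neighbours.
labels = {
    "u1": {"User"},
    "p1": {"Product"},
    "p2": {"Product"},
    "d1": {"Detail"},
    "d2": {"Detail"},
}
edges = {
    "u1": ["p1", "d1"],
    "p1": ["p2"],
    "d1": ["d2"],
}


def reachable(start, excluded_label, max_depth):
    """Breadth-first walk that never steps onto a node with excluded_label."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached, do not expand further
        for nxt in edges.get(node, []):
            if nxt in seen or excluded_label in labels[nxt]:
                continue  # prune: excluded nodes are neither visited nor expanded
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return seen - {start}


print(sorted(reachable("u1", "Detail", max_depth=3)))
```

Because `d1` is pruned, `d2` is never reached either. Whether a real database prunes during expansion or filters complete paths afterwards depends on its query planner, so it is worth checking the query plan on real data.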

With OrientDB you can use class hierarchies, inheritance and polymorphic queries. For example, you can have two classes, say "Class1" (relevant) and "Class2" (details), that share a superclass, say "SuperClass".

Then, if you only need the relevant records, you can query the subclass:

 MATCH
    {class:Class1, as:p1} -TheEdgeClass-> {class:Class1, as:p2, while:($depth < 10)}
 RETURN $elements

or the superclass, if you need both relevant and detail records:

 MATCH
    {class:SuperClass, as:p1} -TheEdgeClass-> {class:SuperClass, as:p2, while:($depth < 10)}
 RETURN $elements

The second query is polymorphic: it returns records of both "Class1" and "Class2", because they both extend "SuperClass".

Exactly the same applies to edge classes, so you can build class hierarchies for edges and use polymorphism to choose which relationships to traverse.
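
As a rough illustration of choosing relationships through an edge class hierarchy, here is a small Python model. It is not OrientDB code: the edge class names `AnyEdge`, `CoreEdge` and `DetailEdge`, the node ids and both helper functions are hypothetical. Traversing by a subclass follows only that kind of edge, while traversing by the root class follows everything.

```python
# Hypothetical edge-class hierarchy: child -> parent (None marks the root).
EDGE_PARENT = {"AnyEdge": None, "CoreEdge": "AnyEdge", "DetailEdge": "AnyEdge"}

# (source, edge class, target) triples; all ids are made up.
EDGES = [
    ("a", "CoreEdge", "b"),
    ("b", "CoreEdge", "c"),
    ("a", "DetailEdge", "x"),
    ("x", "DetailEdge", "y"),
]


def is_a(edge_class, ancestor):
    """True if edge_class equals ancestor or descends from it."""
    while edge_class is not None:
        if edge_class == ancestor:
            return True
        edge_class = EDGE_PARENT[edge_class]
    return False


def traverse(start, edge_class):
    """Collect nodes reachable from start over edges that are-a edge_class."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for src, cls, dst in EDGES:
            if src == node and dst not in seen and is_a(cls, edge_class):
                seen.add(dst)
                stack.append(dst)
    return sorted(seen - {start})


print(traverse("a", "CoreEdge"))  # only the core relationships
print(traverse("a", "AnyEdge"))   # polymorphic: core and detail relationships
```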

Of course there are other strategies. For example, you can add a WHERE condition to the patterns and filter on attributes, but you will lose the advantage of data locality: records of different classes are stored in different files, so a query on a single class is more likely to hit a hot cache and need less disk access.
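
The locality trade-off can be sketched with a toy storage model in Python. This is only an illustration under stated assumptions, not how OrientDB is implemented: each class gets one bucket standing in for its storage file, and the `kind` attribute, the sample records and both query helpers are invented for the example. Both queries return the same rows, but the attribute filter has to open every bucket first.

```python
# Hypothetical class tree: child -> parent (None marks the root).
PARENT = {"SuperClass": None, "Class1": "SuperClass", "Class2": "SuperClass"}

# One bucket per class, standing in for per-class storage files.
BUCKETS = {
    "SuperClass": [],
    "Class1": [{"id": 1, "kind": "relevant"}, {"id": 2, "kind": "relevant"}],
    "Class2": [{"id": 3, "kind": "detail"}, {"id": 4, "kind": "detail"}],
}


def descends_from(cls, ancestor):
    """True if cls equals ancestor or is one of its subclasses."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT[cls]
    return False


def query_class(name):
    """Polymorphic read: open only the buckets of `name` and its subclasses."""
    opened = [c for c in BUCKETS if descends_from(c, name)]
    rows = [r for c in opened for r in BUCKETS[c]]
    return rows, opened


def query_attribute(kind):
    """Attribute filter on the superclass: every bucket is opened, then filtered."""
    rows, opened = query_class("SuperClass")
    return [r for r in rows if r["kind"] == kind], opened


rows_by_class, opened_by_class = query_class("Class1")
rows_by_attr, opened_by_attr = query_attribute("relevant")
print(rows_by_class == rows_by_attr)    # same result set
print(opened_by_class, opened_by_attr)  # but different buckets touched
```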

Also consider that class hierarchies can be multiple levels deep (an actual tree of classes).


 