Gremlin query to find the entire sub-graph that a specific node is connected in any way to

I am brand new to Gremlin and am using gremlin-python to traverse my graph. The graph is made up of many clusters or sub-graphs, each of which is connected internally and not connected to any other cluster in the graph.

A simple example of this is a graph with 5 nodes and 3 edges:

  • Customer_1 is connected to CreditCard_A by the edge 1_HasCreditCard_A
  • Customer_2 is connected to CreditCard_B by the edge 2_HasCreditCard_B
  • Customer_3 is connected to CreditCard_A by the edge 3_HasCreditCard_A

I want a query that returns a sub-graph object containing all nodes and edges connected (in or out) to the queried node. I can then store this sub-graph in a variable and run different traversals on it to calculate different things.

This query would need to be recursive, as these clusters can be made up of nodes that are many (inward or outward) hops away from each other. There are also many different types of nodes and edges, and all of them must be returned.

For example:

  • If I specified Customer_1 in the query, the resulting sub-graph would contain Customer_1, Customer_3, CreditCard_A, 1_HasCreditCard_A, and 3_HasCreditCard_A.
  • If I specified Customer_2, the returned sub-graph would consist of Customer_2, CreditCard_B, and 2_HasCreditCard_B.
  • If I queried Customer_3, the exact same sub-graph object as returned by the Customer_1 query would be returned.
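
The expected results listed above can be checked without any graph database. The following is only a plain-Python sketch of the intended semantics (a breadth-first search that ignores edge direction), not a Gremlin query; the `EDGES` dictionary and the function name `connected_subgraph` are made up for illustration.

```python
from collections import deque

# The example graph from the question: edge name -> (source node, target node).
EDGES = {
    "1_HasCreditCard_A": ("Customer_1", "CreditCard_A"),
    "2_HasCreditCard_B": ("Customer_2", "CreditCard_B"),
    "3_HasCreditCard_A": ("Customer_3", "CreditCard_A"),
}


def connected_subgraph(start, edges=EDGES):
    """Return (nodes, edge names) reachable from start, ignoring edge direction."""
    seen_nodes = {start}
    seen_edges = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for name, (src, dst) in edges.items():
            if node not in (src, dst):
                continue  # this edge does not touch the current node
            seen_edges.add(name)
            other = dst if node == src else src
            if other not in seen_nodes:
                seen_nodes.add(other)
                queue.append(other)
    return seen_nodes, seen_edges


if __name__ == "__main__":
    print(connected_subgraph("Customer_1"))
```

Starting from Customer_1 or from Customer_3 yields the same pair of sets, which is the behaviour the query is supposed to reproduce.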

I have used both Neo4J with Cypher and Dgraph with GraphQL and found this task quite easy in those two languages, but I am struggling a bit more with understanding Gremlin.

EDIT:

From this question, the selected answer should achieve what I want, provided the edge type is left unspecified by changing .both('created') to just .both().

However, the loop syntax .loop{true}{true} is of course invalid in Python. Is this loop function available in gremlin-python? I cannot find anything.

EDIT 2:

I have tried this, and I think it is working as expected.

g.V(node_id).repeat(bothE().otherV().simplePath()).emit()

Is this a valid solution to what I am looking for? Is it also possible to include the queried node in this result?

Regarding the second edit, this looks like a valid solution that returns all the vertices connected to the starting vertex. Some small fixes:

  • You can change bothE().otherV() to both().
  • If you also want to get the starting vertex, you need to move the emit step before the repeat.
  • I would add a dedup step to remove duplicate vertices (there can be more than one path to a vertex).
g.V(node_id).emit().repeat(both().simplePath()).dedup()
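
In gremlin-python, steps used inside repeat() are anonymous traversals and are normally written with the `__` prefix. The following is a minimal sketch of how the query above might be wrapped, not a definitive implementation: the function names `component_vertices` and `example`, the `anon` parameter, and the endpoint ws://localhost:8182/gremlin are assumptions made for illustration, and the camelCase step names follow the question (newer gremlin-python releases also offer snake_case spellings such as simple_path and to_list).

```python
def component_vertices(g, anon, node_id):
    """Return every vertex connected to node_id, including node_id itself.

    g    -- a gremlin-python GraphTraversalSource
    anon -- the anonymous traversal class, i.e. gremlin-python's `__`
    """
    return (
        g.V(node_id)
        .emit()                            # emit before repeat: keep the start vertex
        .repeat(anon.both().simplePath())  # walk edges in either direction, no revisits per path
        .dedup()                           # a vertex may be reached along several paths
        .toList()
    )


def example(node_id):
    # Imports are local so this file loads even where gremlin-python is not installed.
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    from gremlin_python.process.anonymous_traversal import traversal
    from gremlin_python.process.graph_traversal import __

    # Assumed endpoint: a Gremlin Server on localhost exposing traversal source "g".
    conn = DriverRemoteConnection("ws://localhost:8182/gremlin", "g")
    try:
        g = traversal().withRemote(conn)
        return component_vertices(g, __, node_id)
    finally:
        conn.close()
```

Passing `__` in as a parameter is only a convenience that keeps the traversal-building function free of import-time dependencies; importing `__` at module level works just as well.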

Example: https://gremlify.com/jngpuy3dwg9
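
The reason for the dedup step is easier to see on a graph that contains a cycle. The sketch below is plain Python, not Gremlin: it imitates emit().repeat(both().simplePath()) by emitting the last vertex of every simple path that begins at the start vertex. The four-vertex graph and the name `simple_path_ends` are hypothetical and chosen only for illustration.

```python
def simple_path_ends(adjacency, start):
    """Yield the last vertex of every simple path that begins at start."""
    stack = [(start,)]          # the zero-hop path corresponds to emit() before repeat()
    while stack:
        path = stack.pop()
        yield path[-1]
        for nxt in adjacency[path[-1]]:
            if nxt not in path:  # simplePath(): never revisit a vertex on the same path
                stack.append(path + (nxt,))


# Hypothetical undirected graph with a cycle: a-b, a-c, b-d, c-d.
ADJACENCY = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}

if __name__ == "__main__":
    ends = list(simple_path_ends(ADJACENCY, "a"))
    print(len(ends), "emitted,", len(set(ends)), "distinct")
```

Every vertex other than the start is emitted once per simple path that reaches it, so without deduplication the result contains repeats even though each individual path never revisits a vertex.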
