Neo4j bidirectional traversal API

I am playing around with Neo4j, and so far I have a geographical graph where an AIRPORT is connected to a CITY, the CITY to a COUNTRY, and the COUNTRY to a CONTINENT, as depicted in the picture.

Labels on the arrows translate to org.neo4j.graphdb.RelationshipType instances in my code. So far, I can build the path from the start node MXP to the end node LTN using the following mono-directional traversal.

Traverser traverse = database.traversalDescription().depthFirst()
  .relationships(CITY, BOTH)
  .relationships(CONTINENT, BOTH)
  .relationships(COUNTRY, BOTH)
  .relationships(REGION, BOTH)
  .evaluator(Evaluators.includeWhereEndNodeIs(endNode)).traverse(startNode);

With this, I get a single path MXP -> Milan -> Italy -> Europe <- England <- London <- LTN, which is correct given the graph description, the traversal description, and of course my understanding of such a description.

I am trying to change this code to perform a bidirectional traversal, meaning I want to start from both MXP and LTN and stop at the collision point. I tried the following snippet, where the comments reflect my understanding, so it may be easier to point out the problem.

TraversalDescription startSide = database.traversalDescription().depthFirst() //Depth first algorithm
  .relationships(CITY, OUTGOING) //consider CITY relationship, only outgoing
  .relationships(REGION, OUTGOING) //consider REGION relationship, only outgoing
  .relationships(COUNTRY, OUTGOING) //consider COUNTRY relationship, only outgoing
  .relationships(CONTINENT, OUTGOING) //consider CONTINENT relationship, only outgoing
  .evaluator(Evaluators.excludeStartPosition()); //do not consider the starting point. 
                                               //Here I tried also with all, with the same result
                                               //with includeWhereEndNodeIs(endNode), again with same result
                                               //and combining includeWhereEndNodeIs and excludeStartPosition, once more with same result.
                                               //All tries I mirrored for the endSide description, changing endNode to startNode where I feel it was needed

TraversalDescription endSide = database.traversalDescription().depthFirst()
  .relationships(CITY, OUTGOING)
  .relationships(REGION, OUTGOING)
  .relationships(COUNTRY, OUTGOING)
  .relationships(CONTINENT, OUTGOING)
  .evaluator(Evaluators.excludeStartPosition());

List<Node> asList = Arrays.asList(startNode, endNode);
Traverser traverse = database.bidirectionalTraversalDescription().endSide(endSide).startSide(startSide).traverse(asList, asList);

Here, instead of the path I get with the mono-directional traversal, I get two paths, one with only MXP and one with only LTN.

At this point I seriously believe I am completely misunderstanding the bidirectional traversal, and maybe even its purpose. Where is my mistake? Why do I not get the same output?

I finally got a working solution. The problem in my code was related to the concept of uniqueness. The relevant points for my problem are:

Sets the rules for how positions can be revisited during a traversal as stated in Uniqueness. Default if not set is NODE_GLOBAL.

NODE_GLOBAL uniqueness: No node in the entire graph may be visited more than once. This could potentially consume a lot of memory since it requires keeping an in-memory data structure remembering all the visited nodes.

NODE_PATH uniqueness: A node may not occur previously in the path reaching up to it.
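To make the two definitions more concrete, here is a toy model in plain Java. It is only a sketch of the idea, not Neo4j code and not how the traversal framework is implemented: the class UniquenessSketch, the four-node graph A, B, C, D, and the boolean switch are all made up for illustration. It enumerates paths depth first, once with a visited set shared by the whole traversal (the NODE_GLOBAL idea) and once checking only the current path (the NODE_PATH idea).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of traversal uniqueness. Hypothetical code, NOT the Neo4j API.
class UniquenessSketch {
    // Made-up directed graph with two routes to D: A -> B -> D and A -> C -> D.
    static final Map<String, List<String>> EDGES = Map.of(
            "A", List.of("B", "C"),
            "B", List.of("D"),
            "C", List.of("D"),
            "D", List.of());

    // Returns every path found depth first from start.
    // global == true : a node is never visited twice in the whole traversal.
    // global == false: a node may not repeat within a single path only.
    static List<List<String>> paths(String start, boolean global) {
        List<List<String>> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        seen.add(start);
        walk(start, new ArrayList<>(List.of(start)), seen, global, out);
        return out;
    }

    private static void walk(String node, List<String> path, Set<String> seen,
                             boolean global, List<List<String>> out) {
        out.add(new ArrayList<>(path)); // record the path reaching this node
        for (String next : EDGES.get(node)) {
            boolean blocked = global ? seen.contains(next) : path.contains(next);
            if (blocked) {
                continue; // the uniqueness rule rejects this step
            }
            seen.add(next);
            path.add(next);
            walk(next, path, seen, global, out);
            path.remove(path.size() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        System.out.println(paths("A", true));  // 4 paths: D is reached only via B
        System.out.println(paths("A", false)); // 5 paths: D is reached via B and via C
    }
}
```

In this toy graph the shared visited set drops the second route to D, while the per-path check keeps it. Whether that is exactly what happens inside a Neo4j bidirectional traversal is not something this sketch can confirm.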

The descriptions quoted above differ somewhat from the official API documentation, so I experimented with different combinations and ended up with the following code:

TraversalDescription bothSide = database.traversalDescription().depthFirst()
    .relationships(CITY, OUTGOING)
    .relationships(REGION, OUTGOING)
    .relationships(COUNTRY, OUTGOING)
    .relationships(CONTINENT, OUTGOING)
    .uniqueness(NODE_PATH);

Traverser traverser = database
    .bidirectionalTraversalDescription()
    .startSide(bothSide)
    .endSide(bothSide)
    .traverse(node, endNode);

Basically, I defined a common TraversalDescription for both the start and the end side, in which I follow only OUTGOING relationships and consider only paths whose nodes are unique within the path itself.

Then I defined a bidirectional traverser, which simply sets up the start and end sides and traverses the graph from the start node (node) to the end node (endNode). Actually, it traverses from start to end and from end to start at the same time, stops when the two traversals collide, and merges the resulting paths into a single path leading from start to end.
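The "walk from both ends, stop at the collision, merge" idea can be sketched without Neo4j at all. The following plain Java is only an illustration under stated assumptions: the class BidirectionalSketch, the PARENT map (a stand-in for the OUTGOING relationships, with one outgoing edge per node and no cycles), and the method names are hypothetical, and the real traversal framework may interleave and merge the two sides differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model of a bidirectional walk. Hypothetical code, NOT the Neo4j API.
class BidirectionalSketch {
    // Stand-in for the OUTGOING relationships of the example graph.
    // Assumption: each node has at most one outgoing edge and there are no cycles.
    static final Map<String, String> PARENT = Map.of(
            "MXP", "Milan", "Milan", "Italy", "Italy", "Europe",
            "LTN", "London", "London", "England", "England", "Europe");

    // Advances both sides one step at a time until they share a node,
    // then merges them into one start-to-end path. Empty list if they never meet.
    static List<String> meet(String start, String end) {
        List<String> fromStart = new ArrayList<>(List.of(start));
        List<String> fromEnd = new ArrayList<>(List.of(end));
        while (true) {
            for (String candidate : fromStart) {
                if (fromEnd.contains(candidate)) {
                    return merge(fromStart, fromEnd, candidate);
                }
            }
            boolean startMoved = step(fromStart);
            boolean endMoved = step(fromEnd);
            if (!startMoved && !endMoved) {
                return List.of(); // both sides are exhausted without a collision
            }
        }
    }

    // Follows the single outgoing edge of the last node, if there is one.
    private static boolean step(List<String> side) {
        String next = PARENT.get(side.get(side.size() - 1));
        if (next == null) {
            return false;
        }
        side.add(next);
        return true;
    }

    // Start side up to the collision node, then the end side reversed.
    private static List<String> merge(List<String> fromStart, List<String> fromEnd, String hit) {
        List<String> path = new ArrayList<>(fromStart.subList(0, fromStart.indexOf(hit) + 1));
        for (int i = fromEnd.indexOf(hit) - 1; i >= 0; i--) {
            path.add(fromEnd.get(i));
        }
        return path;
    }

    public static void main(String[] args) {
        // prints [MXP, Milan, Italy, Europe, England, London, LTN]
        System.out.println(meet("MXP", "LTN"));
    }
}
```

Both sides only ever follow outgoing edges here, which mirrors the use of the same OUTGOING-only description for the start and the end side; the merged path still reads from start to end because the end side is reversed when the two halves are joined.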

NOTE: I am not completely sure about the meaning of NODE_GLOBAL. In my database each node represents a geographic entity, so each node in the path MXP -> Milan -> Italy -> Europe <- England <- London <- LTN should be visited only once, and thus I would expect no difference between NODE_GLOBAL and NODE_PATH in this context.
