I am playing around with Neo4j, and so far I have a geographical graph where an AIRPORT is connected to a CITY, the CITY to a COUNTRY, and the COUNTRY to a CONTINENT, as depicted in the picture.
Labels on the arrows translate to org.neo4j.graphdb.RelationshipType values in my code. So far, I can build the path from the start node MXP to the end node LTN using the following mono-directional traversal.
Traverser traverse = database.traversalDescription().depthFirst()
.relationships(CITY, BOTH)
.relationships(CONTINENT, BOTH)
.relationships(COUNTRY, BOTH)
.relationships(REGION, BOTH)
.evaluator(Evaluators.includeWhereEndNodeIs(endNode)).traverse(startNode);
With this, I get a single path MXP -> Milan -> Italy -> Europe <- England <- London <- LTN, which is correct given the graph description, the traversal description and, of course, my understanding of that description.
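To make the expected result concrete, here is a minimal plain-Java sketch of the same idea: a depth-first search that follows edges in both directions and returns the first path that ends at the target. It does not use the Neo4j API at all; the class GeoDepthFirst, its methods and the hard-coded sample graph are hypothetical and only mirror the graph described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the graph; not part of Neo4j.
final class GeoDepthFirst {
    // Every edge is stored in both directions, which is what following
    // relationships with direction BOTH amounts to in this model.
    private final Map<String, List<String>> neighbours = new HashMap<>();

    void connect(String a, String b) {
        neighbours.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        neighbours.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns the first path found from start to end, or an empty list.
    List<String> findPath(String start, String end) {
        Deque<String> path = new ArrayDeque<>();
        return dfs(start, end, new HashSet<>(), path) ? new ArrayList<>(path) : List.of();
    }

    private boolean dfs(String node, String end, Set<String> visited, Deque<String> path) {
        visited.add(node);          // each node is visited at most once
        path.addLast(node);
        if (node.equals(end)) {
            return true;
        }
        for (String next : neighbours.getOrDefault(node, List.of())) {
            if (!visited.contains(next) && dfs(next, end, visited, path)) {
                return true;
            }
        }
        path.removeLast();          // dead end: backtrack
        return false;
    }

    // Sample data mirroring the graph in the question.
    static GeoDepthFirst sampleGraph() {
        GeoDepthFirst g = new GeoDepthFirst();
        g.connect("MXP", "Milan");
        g.connect("Milan", "Italy");
        g.connect("Italy", "Europe");
        g.connect("LTN", "London");
        g.connect("London", "England");
        g.connect("England", "Europe");
        return g;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" - ", sampleGraph().findPath("MXP", "LTN")));
    }
}
```

The sketch ignores relationship types and treats the graph as undirected, so it only shows why a single path through Europe is the expected answer, not how Neo4j evaluates the traversal internally.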
I am trying to change this code to perform a bidirectional traversal, meaning I want to start from both MXP and LTN and stop at the collision point. I tried the following snippet, where the comments reflect my understanding, so it may be easier to point out the problem.
TraversalDescription startSide = database.traversalDescription().depthFirst() //Depth first algorithm
.relationships(CITY, OUTGOING) //consider CITY relationship, only outgoing
.relationships(REGION, OUTGOING) //consider REGION relationship, only outgoing
.relationships(COUNTRY, OUTGOING) //consider COUNTRY relationship, only outgoing
.relationships(CONTINENT, OUTGOING) //consider CONTINENT relationship, only outgoing
.evaluator(Evaluators.excludeStartPosition()); //do not consider the starting point.
//Here I tried also with all, with the same result
//with includeWhereEndNodeIs(endNode), again with same result
//and combining includeWhereEndNodeIs and excludeStartPosition, once more with same result.
//All tries I mirrored for the endSide description, changing endNode to startNode where I feel it was needed
TraversalDescription endSide = database.traversalDescription().depthFirst()
.relationships(CITY, OUTGOING)
.relationships(REGION, OUTGOING)
.relationships(COUNTRY, OUTGOING)
.relationships(CONTINENT, OUTGOING)
.evaluator(Evaluators.excludeStartPosition());
List<Node> asList = Arrays.asList(startNode, endNode);
Traverser traverse = database.bidirectionalTraversalDescription().endSide(endSide).startSide(startSide).traverse(asList, asList);
Here, instead of the path I get with the mono-directional traversal, I get two paths, one with only MXP and one with only LTN.
At this point I seriously believe I am completely misunderstanding the bidirectional traversal and maybe even its purpose. Where is my mistake? Why do I not get the same output?
I finally got a working solution. The problem in my code was related to the concept of uniqueness. The relevant points for my problem are:
Sets the rules for how positions can be revisited during a traversal as stated in Uniqueness. Default if not set is NODE_GLOBAL.
NODE_GLOBAL uniqueness: No node in the entire graph may be visited more than once. This could potentially consume a lot of memory since it requires keeping an in-memory data structure remembering all the visited nodes.
NODE_PATH uniqueness: A node may not occur previously in the path reaching up to it.
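The difference between the two rules is easiest to see on a graph where two routes lead to the same node. The following plain-Java sketch is only a model of the two definitions quoted above, not Neo4j's implementation; the class UniquenessSketch and the diamond-shaped sample graph are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the two uniqueness rules; not the Neo4j API.
final class UniquenessSketch {
    // Directed diamond: A -> B -> D and A -> C -> D.
    static final Map<String, List<String>> EDGES = Map.of(
            "A", List.of("B", "C"),
            "B", List.of("D"),
            "C", List.of("D"),
            "D", List.of());

    // Collects every path a depth-first walk emits from 'start'.
    // global = true : a node is skipped if any earlier path already reached it.
    // global = false: a node is skipped only if it is already on the current path.
    static List<List<String>> paths(String start, boolean global) {
        List<List<String>> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        seen.add(start);
        walk(start, new ArrayList<>(List.of(start)), seen, global, out);
        return out;
    }

    private static void walk(String node, List<String> path, Set<String> seen,
                             boolean global, List<List<String>> out) {
        out.add(List.copyOf(path));
        for (String next : EDGES.getOrDefault(node, List.of())) {
            boolean blocked = global ? seen.contains(next) : path.contains(next);
            if (blocked) {
                continue;
            }
            seen.add(next);
            path.add(next);
            walk(next, path, seen, global, out);
            path.remove(path.size() - 1);   // backtrack
        }
    }

    public static void main(String[] args) {
        System.out.println("global: " + paths("A", true));
        System.out.println("path:   " + paths("A", false));
    }
}
```

With the global rule, D is reached through B first, so the route A, C, D is never emitted. With the per-path rule, both routes to D are emitted, because D does not occur earlier on either of them.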
These descriptions are somewhat different from the official API documentation, so I played around with different combinations and ended up with the following code:
TraversalDescription bothSide = database.traversalDescription().depthFirst()
.relationships(CITY, OUTGOING)
.relationships(REGION, OUTGOING)
.relationships(COUNTRY, OUTGOING)
.relationships(CONTINENT, OUTGOING)
.uniqueness(NODE_PATH);
Traverser traverser = database
.bidirectionalTraversalDescription()
.startSide(bothSide)
.endSide(bothSide)
.traverse(node, endNode);
Basically, I defined a common TraversalDescription for both the start and the end side, in which I follow only OUTGOING relationships and consider only paths whose nodes are unique within the path itself.
Then I defined a bidirectional traverser, which simply sets up the start and the end side and traverses the graph from the start node (node) to the end node (endNode). Strictly speaking, it traverses from start to end AND from end to start at the same time, stops when the two traversals collide, and merges the resulting paths into a single path leading from start to end.
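The collide-and-merge step can be sketched in plain Java as follows. This is a simplified model under the assumption that every node has at most one OUTGOING edge (airport to city to country to continent); it is not how Neo4j implements bidirectional traversal, and the class BidirectionalSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical model of "walk from both ends and merge at the collision".
final class BidirectionalSketch {
    // Assumption: at most one OUTGOING edge per node.
    static final Map<String, String> OUT = Map.of(
            "MXP", "Milan", "Milan", "Italy", "Italy", "Europe",
            "LTN", "London", "London", "England", "England", "Europe");

    // Advances both sides alternately until they share a node, then merges.
    static List<String> meet(String start, String end) {
        List<String> fromStart = new ArrayList<>(List.of(start));
        List<String> fromEnd = new ArrayList<>(List.of(end));
        boolean startTurn = true;
        while (true) {
            List<String> merged = merge(fromStart, fromEnd);
            if (!merged.isEmpty()) {
                return merged;
            }
            // Try the side whose turn it is; fall back to the other side.
            boolean moved = step(startTurn ? fromStart : fromEnd)
                    || step(startTurn ? fromEnd : fromStart);
            if (!moved) {
                return List.of();   // both sides exhausted without meeting
            }
            startTurn = !startTurn;
        }
    }

    // Follows the single OUTGOING edge from the last node, if there is one.
    private static boolean step(List<String> side) {
        String next = OUT.get(side.get(side.size() - 1));
        if (next == null) {
            return false;
        }
        side.add(next);
        return true;
    }

    // If the sides share a node, joins start-side path and reversed end-side path.
    private static List<String> merge(List<String> fromStart, List<String> fromEnd) {
        for (int i = 0; i < fromStart.size(); i++) {
            int j = fromEnd.indexOf(fromStart.get(i));
            if (j < 0) {
                continue;
            }
            List<String> path = new ArrayList<>(fromStart.subList(0, i + 1));
            List<String> tail = new ArrayList<>(fromEnd.subList(0, j));
            Collections.reverse(tail);
            path.addAll(tail);
            return path;
        }
        return List.of();
    }

    public static void main(String[] args) {
        System.out.println(String.join(" - ", meet("MXP", "LTN")));
    }
}
```

In this model, calling meet with the same node on both sides collides immediately and returns a one-node path. That has the same shape as the single-node paths from my first attempt, where both nodes were passed as start and as end, but whether Neo4j behaves this way for the same reason is an assumption I have not verified.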
NOTE: I am not completely sure about the meaning of NODE_GLOBAL, since in my database each node represents a geographic entity, so each node in the path MXP -> Milan -> Italy -> Europe <- England <- London <- LTN should be visited only once, and thus there should be no difference between NODE_GLOBAL and NODE_PATH in this context.