NEO4J Cypher Query: Relationship Direction Bug in Where Clause

Question

Sample Data:

Sample Query

CREATE (a1:A {title: "a1"})
CREATE (a2:A {title: "a2"})
CREATE (a3:A {title: "a3"})

CREATE (b1:B {title: "b1"})
CREATE (b2:B {title: "b2"})

MATCH (a:A {title: "a1"}), (b:B {title: "b1"})
CREATE (a)-[r:LINKS]->(b)

MATCH (a:A {title: "a2"}), (a1:A {title: "a1"}) 
CREATE (a)-[:CONNECTED]->(a1)

MATCH (a:A), (b:B) return a,b

Objective: Finding some connections in the where clause

Now let's write some variations to find A's not directly connected to B (a2 and a3)

// Q1. Both work fine
MATCH (a:A) WHERE (a)--(:B) RETURN a
MATCH (a:A) WHERE (:B)--(a) RETURN a

// Q2. Works
MATCH (a:A)-[r]-(b:B) WHERE (a)-[r]-(b) RETURN a

// Q3. Fails
MATCH (a:A)-[r]-(b:B) WHERE (b)-[r]-(a) RETURN a

Any idea why Q2 and Q3 do not behave the same way, even though the relationship pattern is undirected in both? Is this a NEO4J bug?

All credit to stdob, who in this answer narrowed down the anomaly that was happening in my other query.

Update: Posted the same to the NEO4J GitHub issues

Update: NEO4J has accepted this as a bug and will be fixing it in 3.1

Answer 1

While this might not be a complete answer, it is too much info for a comment. This should hopefully provide some helpful insight though.

I would consider this a bug. Below are some variations that should all give the same result on the sample data. Each of them should pass, where "pass" means the query returns at least one row.

MATCH (a:A)-[r]-(b:B) WHERE (b)-[r]-(a) RETURN * // fails

// remove r
MATCH (a:A)--(b:B) WHERE (b)--(a) RETURN * // passes
MATCH (a:A)-[r]-(b:B) WHERE (b)--(a) RETURN * // passes

// add direction
MATCH (a:A)-[r]-(b:B) WHERE (b)<-[r]-(a) RETURN * // passes

// reverse order
MATCH (a:A)-[r]-(b:B) WHERE (a)-[r]-(b) RETURN * // passes
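
If you need a direction-agnostic check on r in the meantime, one possible workaround is to drop the pattern predicate and compare the endpoints of r explicitly. This is only a sketch: startNode() and endNode() are standard Cypher functions, but whether this form sidesteps the planner path that the failing query takes is my assumption and is not verified against the affected release.

```cypher
// Direction-agnostic endpoint check on r, with no pattern predicate in WHERE.
// Assumption: this avoids the code path that makes the reversed pattern fail.
MATCH (a:A)-[r]-(b:B)
WHERE (startNode(r) = a AND endNode(r) = b)
   OR (startNode(r) = b AND endNode(r) = a)
RETURN a
```

On the sample data this should return a1, the same as the passing variations.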

And, from the profile of the failed test

+---------------------+----------------+------+---------+-----------+--------------+
| Operator            | Estimated Rows | Rows | DB Hits | Variables | Other        |
+---------------------+----------------+------+---------+-----------+--------------+
| +ProduceResults     |              1 |    0 |       0 | a         | a            |
| |                   +----------------+------+---------+-----------+--------------+
| +SemiApply          |              1 |    0 |       0 | a, b, r   |              |
| |\                  +----------------+------+---------+-----------+--------------+
| | +ProjectEndpoints |              1 |    0 |       0 | a, b, r   | r, b, a      |
| | |                 +----------------+------+---------+-----------+--------------+
| | +Argument         |              2 |    1 |       0 | a, b, r   |              |
| |                   +----------------+------+---------+-----------+--------------+
| +Filter             |              2 |    1 |       1 | a, b, r   | a:A          |
| |                   +----------------+------+---------+-----------+--------------+
| +Expand(All)        |              2 |    1 |       3 | a, r -- b | (b)-[r:]-(a) |
| |                   +----------------+------+---------+-----------+--------------+
| +NodeByLabelScan    |              2 |    2 |       3 | b         | :B           |
+---------------------+----------------+------+---------+-----------+--------------+

and the equivalent passed test (reverse order)

+---------------------+----------------+------+---------+-----------+--------------+
| Operator            | Estimated Rows | Rows | DB Hits | Variables | Other        |
+---------------------+----------------+------+---------+-----------+--------------+
| +ProduceResults     |              1 |    1 |       0 | a         | a            |
| |                   +----------------+------+---------+-----------+--------------+
| +SemiApply          |              1 |    1 |       0 | a, b, r   |              |
| |\                  +----------------+------+---------+-----------+--------------+
| | +ProjectEndpoints |              1 |    0 |       0 | a, b, r   | r, a, b      |
| | |                 +----------------+------+---------+-----------+--------------+
| | +Argument         |              2 |    1 |       0 | a, b, r   |              |
| |                   +----------------+------+---------+-----------+--------------+
| +Filter             |              2 |    1 |       1 | a, b, r   | a:A          |
| |                   +----------------+------+---------+-----------+--------------+
| +Expand(All)        |              2 |    1 |       3 | a, r -- b | (b)-[r:]-(a) |
| |                   +----------------+------+---------+-----------+--------------+
| +NodeByLabelScan    |              2 |    2 |       3 | b         | :B           |
+---------------------+----------------+------+---------+-----------+--------------+

Notice the row count after step 1 in each. The same plan should not produce different results. I can only speculate that it is a bug related to the graph pruning shortcuts: once Neo4j traverses an edge in one direction, it will not traverse back over the same edge in the same match, which is an anti-cycle fail-safe and performance feature. So, in theory, when the WHERE pattern reverses the order used in the MATCH pattern, Neo4j has to traverse a pruned edge to validate the relationship. If the check runs in the same direction, it passes automatically. If Neo4j tries the same check in reverse, it fails because that edge has been pruned. This is just a theory, though; the validation that is technically failing is the check on r in reverse.
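
To reproduce the two plans yourself, prefix each query with PROFILE and compare the Rows column of the operators. A minimal sketch, using the failing and passing variants from above; the exact operators and counts may differ between Neo4j versions.

```cypher
// Run each statement separately.

// Failing variant: the WHERE pattern lists the nodes in the opposite order to MATCH.
PROFILE MATCH (a:A)-[r]-(b:B) WHERE (b)-[r]-(a) RETURN a

// Passing variant: the WHERE pattern uses the same node order as MATCH.
PROFILE MATCH (a:A)-[r]-(b:B) WHERE (a)-[r]-(b) RETURN a
```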
