简体   繁体   中英

NEO4j Cypher query to return paths with a condition on the number of connected nodes

I would like to clean up my graph database a bit by removing unnecessary nodes. In one case, unnecessary nodes are nodes B between nodes A and C where B has the same name as node C and NO OTHER incoming relationships. I am having trouble coming up with a Cypher query that restricts the number of incoming edges.

The first part was easy:

MATCH (n1:TypeA)<-[r1:Inside]-(n2:TypeB)<-[r2:Inside]-(n3:TypeC)
WHERE n2.name = n3.name

Based on other SE questions (especially this one ) I then tried doing something like:

WITH n2, collect(r2) as rr
WHERE length(rr) = 1 
RETURN n2

but this also returned nodes with more than one incoming edge. It seems my WHERE clause on the length is not filtering the returned n2 nodes. I tried a few other things I found online, but they either returned nothing or were no longer syntactically correct in the current version.

After I find the n2 nodes that match the pattern, I'll want to connect n3 directly to n1 and DETACH DELETE n2 . Again, I was easily able to do that part when I didn't need the restriction on the number of incoming edges to n2 . That previous question has FOREACH (r IN rr | DELETE r) , but I want to detach delete the n2 nodes, not just those edges. I don't know how to correctly adapt this to operating on the nodes attached to the r s and I certainly want to be sure it's finding the correct nodes before deleting anything since Neo4j lacks basic undo functionality (but you can't put a RETURN command inside a FOREACH for some crazy reason).

How do I filter nodes on a path by the number of incoming edges using Cypher?

I think I can do this in py2neo by first collecting all the n1,n2,n3 triples matching the pattern, then going through each returned record and add them to a list if n2 has only one incoming edge. Then go through that list and perform the trimming operation, but if this can be done in pure Cypher, then I'd like to know how because I have a number of similar adjustments to make.

You need to pass along path in your WITH statement.

MATCH path = (n1:Ward)<-[r1:PARTOF]-(n2:Unknown)<-[r2:PARTOF]-(n3:Chome)
WHERE n2.name = n3.name
WITH path, size((n2)<-[:PARTOF]-()) as degree
WHERE degree = 1
RETURN path

Or shorter like this:

MATCH path = (n1:Ward)<-[r1:PARTOF]-(n2:Unknown)<-[r2:PARTOF]-(n3:Chome)
WHERE n2.name = n3.name
AND size((n2)<-[:PARTOF]-()) = 1
RETURN path

Borrowing some insight from this answer I came up with a kludge that seems to work.

MATCH path = (n1:Ward)<-[r1:PARTOF]-(n2:Unknown)<-[r2:PARTOF]-(n3:Chome)
WHERE n2.name = n3.name
WITH n2, size((n2)<-[:PARTOF]-()) as degree
WHERE degree = 1
MATCH (n1:Ward)<-[r1:PARTOF]-(n2:Unknown)<-[r2:PARTOF]-(n3:Chome)
RETURN n1,n2,n3

I expect that not all parts of that are needed and it isn't an efficient solution, but I don't know enough yet to improve upon it.

For example, I define the first line as path , but I can't use MATCH path in the penultimate line, and I have no idea why.

Also, if I write WITH size((n2)<-[:PARTOF]-()) as degree (dropping the n2, after WITH) it returns not only the n2 with degree > 1, but also all the nodes connected to them (even more than the n3 nodes). I don't know why it behaves like this and the Neo4j documentation for WITH has no examples or description to help me understand why the n2 is necessary here. Any improvements to my Cypher query or explanation of how or why it works would be greatly appreciated.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM