简体   繁体   中英

Find and count all possible paths starting from one node not exceeding specific length in Cypher Neo4j

I have a graph with several nodes connected with two possible relationships: NextExp and PrevExp. Each relationship has a specific ID, so it is possible to have multiple relationships between two nodes.

   ID1      ID1     ID1  
SD ---> SSD ---> LD ---> CEO 
   ID2      ID2
SD ---> SSD ---> LD
   ID2      ID2
SD ---> SSD ---> VF
            ID2
        SSD ---> BO
   ID3      ID3     ID3
SD ---> ST  ---> BO ---> CTO
   ID4      ID4
SD ---> ST  ---> MD
            ID5
        ST  ---> BB

I want to find (actually count) all possible paths with specific lengths starting from node SD, counting only a specific relationship (ie NextExp for instance). For the example above, if the max length is 3, I would like to have something like this:

Length   |       Paths       |   Count
2         SD --> SSD           2
          SD --> ST            2
3         SD --> SSD --> LD    2
          SD --> SSD --> VF    1
          SD --> ST  --> BO    1
          SD --> ST  --> MD    1

Being new in Neo4j, I tried using MATCH but could not find how to not specify an ending node when searching for paths.

I might be overcomplicating this, but given some sample data that looks close to your diagram:

MERGE (sd: Node { name: 'SD' })
MERGE (ssd: Node { name: 'SSD' })
MERGE (ld: Node { name : 'LD' })
MERGE (ceo: Node { name: 'CEO' })
MERGE (cto: Node { name: 'CTO' })
MERGE (st: Node { name: 'ST' })
MERGE (vf: Node { name: 'VF' })
MERGE (bo: Node { name: 'BO' })
MERGE (md: Node { name: 'MD' })
MERGE (bb: Node { name: 'BB' })
MERGE (sd)-[:NEXTEXP { id: 1 }]->(ssd)
MERGE (ssd)-[:NEXTEXP { id: 1 }]->(ld)
MERGE (ld)-[:NEXTEXP { id: 1 }]->(ceo)
MERGE (sd)-[:NEXTEXP { id: 2 }]->(ssd)
MERGE (ssd)-[:NEXTEXP { id: 2 }]->(ld)
MERGE (ssd)-[:NEXTEXP { id: 2 }]->(vf)
MERGE (ssd)-[:NEXTEXP { id: 2 }] ->(bo)
MERGE (sd)-[:NEXTEXP { id: 3 }]->(st)
MERGE (st)-[:NEXTEXP { id: 3 }]->(bo)
MERGE (bo)-[:NEXTEXP { id: 3 }]->(cto)
MERGE (sd)-[:NEXTEXP { id: 4 }]->(st)
MERGE (st)-[:NEXTEXP { id: 4 }]->(md)
MERGE (st)-[:NEXTEXP { id: 5 }]->(bb)

The following appears to yield the right sort of answer:

 MATCH path=(s: Node { name: 'SD' })-[:NEXTEXP*1..2]->(other: Node)
  WITH distinct nodes(path) as pathNodes, 
       relationships(path) as pathRels, 
       reduce(minValue = head(relationships(path)).id, x in relationships(path) | (case when x.id = minValue then x.id else null end)) as pathRelIds
 WHERE pathRelIds IS NOT NULL
  WITH pathNodes, count(pathNodes) as ct, size(pathNodes) as pathLength
RETURN pathLength as `Length`, pathNodes as `Paths`, ct as `Count`
 ORDER BY `Length`, `Paths`, `Count`

The query assumes that you need the ID of the relationships to be the same throughout the path. This seemed to be what you're asking since in your 'expected' table SD -> SSD -> LD has a count of 2, rather than 3.

Breaking it down a bit:

  • We match for between 1 and 2 relationships and assign the resulting paths to a variable called path
  • The reduce call in the second line tries to return either the smallest relationship ID within the path, or NULL if there are multiple relationship IDs - it's a bit of a cheat that relies on how Cypher treats NULL in expressions
  • The third line eliminates those cases where the previous reduce returned NULL - that is, we eliminate paths where there's more than one ID on the relationships in the path

For the exact output you're asking for (where the names of nodes are joined into a single string), the following amended query:

 MATCH path=(s: Node { name: 'SD' })-[:NEXTEXP*1..2]->(other: Node)
  WITH distinct s, 
       nodes(path) as pathNodes, 
       relationships(path) as pathRels, 
       reduce(minValue = head(relationships(path)).id, x in relationships(path) | (case when x.id = minValue then x.id else null end)) as pathRelIds
 WHERE pathRelIds IS NOT NULL
  WITH s, pathNodes, count(pathNodes) as ct, size(pathNodes) as pathLength
RETURN pathLength as `Length`, 
       reduce(pathStr = s.name, x in pathNodes[1..] | pathStr + ' --> ' + x.name) as `Paths`, 
       ct as `Count`
 ORDER BY `Length`, `Count` DESC, `Paths`

Yields the following output:

╒════════╤═══════════════════╤═══════╕
│"Length"│"Paths"            │"Count"│
╞════════╪═══════════════════╪═══════╡
│2       │"SD --> SSD"       │2      │
├────────┼───────────────────┼───────┤
│2       │"SD --> ST"        │2      │
├────────┼───────────────────┼───────┤
│3       │"SD --> SSD --> LD"│2      │
├────────┼───────────────────┼───────┤
│3       │"SD --> SSD --> BO"│1      │
├────────┼───────────────────┼───────┤
│3       │"SD --> SSD --> VF"│1      │
├────────┼───────────────────┼───────┤
│3       │"SD --> ST --> BO" │1      │
├────────┼───────────────────┼───────┤
│3       │"SD --> ST --> MD" │1      │
└────────┴───────────────────┴───────┘

Which differs from your expected only by the SD --> SSD --> BO row, which is probably due to me misinterpreting your diagram.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM