Neo4j Cypher: Find common nodes between a set of matched nodes

Very similar to the question posted here

I have two kinds of nodes: Article and Word. Each article is connected to the words it mentions by a MENTIONED relationship.

I need to query all articles that have a given set of words in common, where the list of words is dynamic. From the client's perspective, I pass in a list of words and expect back the articles that mention all of them.

The following query does the job

WITH ["orange", "apple"] as words
MATCH (w:Word)<-[:MENTIONED]-(a:Article)-[:MENTIONED]->(w2:Word)
WHERE w.name IN words AND w2.name IN words
RETURN a, w, w2

but it does not work when the word list contains only one word. How can I make it handle any number of words? Is there a better way to do this?

Yes. There are two approaches I can think of:

  1. Finding all articles that contain some subset of those words, and then returning only articles where the number of words mentioned is the number of words you supplied in your wordlist.

  2. Getting the :Word nodes for the given list of words, and then getting articles where all words are mentioned in the article.

Here's an example graph to test this on:

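// MERGE takes a single pattern, so each node and relationship gets its own clause
MERGE (a1:Article {name:'a1'})
MERGE (a2:Article {name:'a2'})
MERGE (a3:Article {name:'a3'})
MERGE (w1:Word {name:'orange'})
MERGE (w2:Word {name:'apple'})
MERGE (w3:Word {name:'pineapple'})
MERGE (w4:Word {name:'banana'})
// a1 mentions all four words, a2 two of them, a3 three of them
MERGE (a1)-[:MENTIONED]->(w1)
MERGE (a1)-[:MENTIONED]->(w2)
MERGE (a1)-[:MENTIONED]->(w3)
MERGE (a1)-[:MENTIONED]->(w4)
MERGE (a2)-[:MENTIONED]->(w1)
MERGE (a2)-[:MENTIONED]->(w4)
MERGE (a3)-[:MENTIONED]->(w1)
MERGE (a3)-[:MENTIONED]->(w2)
MERGE (a3)-[:MENTIONED]->(w3)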

Approach 1, comparing the wordlist size to the number of words mentioned in the article, looks like this:

WITH ["orange", "apple"] as words
MATCH (word:Word)<-[:MENTIONED]-(article:Article)
WHERE word.name IN words
WITH words, article, COUNT(word) as wordCount
WHERE wordCount = SIZE(words)
RETURN article

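This only works if there is at most one :MENTIONED relationship between an article and a given word, no matter how many times the article mentions that word. If duplicate relationships are possible, counting with COUNT(DISTINCT word) avoids the overcount.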
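To see that caveat in action, here is a minimal plain-Python sketch of approach 1 over the example graph. It is only a model of the query, not Neo4j code: EDGES, articles_by_count, and the distinct flag are names made up for this illustration, and the distinct flag stands in for what I would expect COUNT(DISTINCT word) to do in Cypher.

```python
from collections import defaultdict

# Hypothetical stand-in for the example graph: one (article, word) pair
# per :MENTIONED relationship.
EDGES = [
    ("a1", "orange"), ("a1", "apple"), ("a1", "pineapple"), ("a1", "banana"),
    ("a2", "orange"), ("a2", "banana"),
    ("a3", "orange"), ("a3", "apple"), ("a3", "pineapple"),
]


def articles_by_count(edges, words, distinct=False):
    """Keep articles whose number of matched words equals len(words)."""
    wanted = set(words)
    matched = defaultdict(list)
    for article, word in edges:
        if word in wanted:  # models: WHERE word.name IN words
            matched[article].append(word)

    result = []
    for article, hits in matched.items():
        # One hit per relationship models COUNT(word);
        # de-duplicating first models COUNT(DISTINCT word).
        count = len(set(hits)) if distinct else len(hits)
        if count == len(words):  # models: WHERE wordCount = SIZE(words)
            result.append(article)
    return sorted(result)
```

With the example graph, articles_by_count(EDGES, ["orange", "apple"]) gives a1 and a3, and a one-word list such as ["orange"] gives all three articles. If a duplicate ("a2", "orange") pair is appended, the plain count lets a2 through for ["orange", "apple"], while distinct=True does not.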

Approach 2 uses ALL() on the collection of :Word nodes to ensure that we only match an article if it mentions every one of them:

WITH ["orange", "apple"] AS words
MATCH (word:Word)
WHERE word.name IN words
WITH COLLECT(word) AS wordNodes
MATCH (article:Article)
WHERE ALL (word IN wordNodes WHERE (word)<-[:MENTIONED]-(article))
RETURN article
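As a rough check of what approach 2 returns, here is a small plain-Python model of it over the example graph above. It is an illustration and not Neo4j code: WORD_NODES, MENTIONS, and articles_mentioning_all are hypothetical names. It assumes, as I understand Cypher's semantics, that a name with no matching :Word node simply drops out of the collected list, and that ALL() over an empty list is true.

```python
# Hypothetical stand-ins for the example graph.
WORD_NODES = {"orange", "apple", "pineapple", "banana"}  # names of :Word nodes
MENTIONS = {  # article name -> names of the words it MENTIONED
    "a1": {"orange", "apple", "pineapple", "banana"},
    "a2": {"orange", "banana"},
    "a3": {"orange", "apple", "pineapple"},
}


def articles_mentioning_all(mentions, word_nodes, words):
    """Return articles that mention every requested word that exists as a node."""
    # Models MATCH (word:Word) WHERE word.name IN words, then COLLECT(word):
    # names without a :Word node never make it into the collection.
    present = [w for w in words if w in word_nodes]
    # Models WHERE ALL(word IN ... WHERE (word)<-[:MENTIONED]-(article)).
    # all() over an empty list is True, so an empty collection matches everything.
    return sorted(
        article
        for article, mentioned in mentions.items()
        if all(w in mentioned for w in present)
    )
```

One consequence worth checking on real data: a word with no :Word node (say, a misspelling) is silently ignored in this model, so ["orange", "kiwi"] behaves like ["orange"] and returns all three articles. Comparing the size of the collected list with the size of the input list would be one way to catch that.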

You can try using PROFILE with each of these to figure out which works best with your data set.
