
Neo4j: find nodes with similar connections

I am trying to solve the following problem: I have two kinds of nodes, :Merchant and :Customer, related by a :BUY relationship. I want to find :Merchant nodes that have the same :Customer nodes or, even better, that share, say, 90% of their :Customer nodes. Thank you.

This helped me: https://neo4j.com/docs/graph-algorithms/current/algorithms/similarity-jaccard/

It's important that all nodes you want to compare are in the same graph and connected.

MATCH (p:Person)-[:LIKES]->(cuisine)
WITH {item:id(p), categories: collect(id(cuisine))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data)
YIELD item1, item2, count1, count2, intersection, similarity
RETURN algo.getNodeById(item1).name AS from, algo.getNodeById(item2).name AS to, intersection, similarity
ORDER BY similarity DESC

For your case it should be something like this (the details depend on your data model):

MATCH (m:Merchant)-[:BUY]-(c:Customer)
WITH {item:id(m), categories: collect(id(c))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data)
YIELD item1, item2, count1, count2, intersection, similarity
WHERE similarity > 0.9
RETURN algo.getNodeById(item1).name AS from, algo.getNodeById(item2).name AS to, intersection, similarity
ORDER BY similarity DESC

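As I understand it, the procedure computes the Jaccard index ( https://en.wikipedia.org/wiki/Jaccard_index ) over the sets of node ids collected for each item, i.e. the size of the intersection of two sets divided by the size of their union.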
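To make the score concrete, here is a minimal sketch in plain Python (not Cypher, and not part of the Graph Algorithms plugin) that computes the Jaccard index by hand. The merchant names, the customer ids, and the helper names `jaccard`, `shared_fraction` and `similar_pairs` are all made up for illustration. It also shows a point worth keeping in mind: a Jaccard score of 0.9 is stricter than "shares 90% of the customers", because the union in the denominator grows when one merchant has many more customers than the other. The sketch uses `>=` for the threshold, whereas the query above uses a strict `>`.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: |A & B| / |A | B| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def shared_fraction(a, b):
    """Fraction of the smaller set that also appears in the other set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Hypothetical data: merchant name -> set of customer ids.
customers_by_merchant = {
    "m1": set(range(1, 11)),  # customers 1..10
    "m2": set(range(1, 10)),  # customers 1..9
    "m3": set(range(1, 21)),  # customers 1..20
}


def similar_pairs(data, threshold, score=jaccard):
    """Return (merchant1, merchant2, score) for pairs scoring >= threshold."""
    pairs = []
    for (n1, s1), (n2, s2) in combinations(sorted(data.items()), 2):
        value = score(s1, s2)
        if value >= threshold:
            pairs.append((n1, n2, value))
    # Highest score first, like ORDER BY similarity DESC.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


if __name__ == "__main__":
    print(similar_pairs(customers_by_merchant, 0.9))
    print(similar_pairs(customers_by_merchant, 0.9, score=shared_fraction))
```

With this sample data only m1 and m2 pass the Jaccard threshold, while all three pairs pass when the score is the fraction of the smaller customer set that is shared. Which of the two matches "share 90% of the customers" depends on what you mean by it.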

PS: You need to install the Graph Algorithms plugin to use it: https://neo4j.com/docs/graph-algorithms/current/introduction/#_installation


 