I am trying to figure out the following problem: I have two node labels, :Merchant and :Customer. The two are related by a :BUY relationship. I am trying to find :Merchant nodes that have the same :Customer nodes or, even better, that share, let's say, 90% of their :Customer nodes. Thank you.
This helped me: https://neo4j.com/docs/graph-algorithms/current/algorithms/similarity-jaccard/
It's important that all nodes you want to compare are in the same graph and connected.
MATCH (p:Person)-[:LIKES]->(cuisine)
WITH {item:id(p), categories: collect(id(cuisine))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data)
YIELD item1, item2, count1, count2, intersection, similarity
RETURN algo.getNodeById(item1).name AS from, algo.getNodeById(item2).name AS to, intersection, similarity
ORDER BY similarity DESC
For your case it should be something like this (adjust the labels and the direction of the :BUY relationship to match your database):
MATCH (p:Merchant)-[:BUY]->(consumer)
WITH {item:id(p), categories: collect(id(consumer))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data)
YIELD item1, item2, count1, count2, intersection, similarity
WHERE similarity > 0.9
RETURN algo.getNodeById(item1).name AS from, algo.getNodeById(item2).name AS to, intersection, similarity
ORDER BY similarity DESC
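As I understand it, the procedure computes the Jaccard index ( https://en.wikipedia.org/wiki/Jaccard_index ) over the sets of node ids collected for each item, that is, the size of the intersection of two sets divided by the size of their union.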
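To make the 0.9 threshold concrete, here is a minimal Python sketch of the same Jaccard calculation outside the database. It is only an illustration, not what the plugin runs internally: the function names, the merchant keys, and the customer ids are all made up for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets: treat the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def similar_pairs(customers_by_merchant, threshold=0.9):
    """Return (merchant1, merchant2, score) for pairs scoring at least `threshold`."""
    items = sorted(customers_by_merchant.items(), key=lambda kv: kv[0])
    pairs = []
    for (m1, c1), (m2, c2) in combinations(items, 2):
        score = jaccard(c1, c2)
        if score >= threshold:
            pairs.append((m1, m2, score))
    # Highest similarity first, like ORDER BY similarity DESC.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical data: merchant key -> set of customer ids.
purchases = {
    "m1": set(range(1, 11)),  # customers 1..10
    "m2": set(range(1, 10)),  # customers 1..9
    "m3": {1, 2, 50},
}

for m1, m2, score in similar_pairs(purchases):
    print(m1, m2, score)
```

Note that the Jaccard index divides by the union, so it is stricter than "merchant A shares 90% of its own customers with merchant B": a small merchant whose customers are all also customers of a much larger merchant will still get a low score. Also, this sketch keeps pairs with a score of at least the threshold, whereas the query above uses a strict `> 0.9`.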
PS: The algo.* procedures are only available once the Graph Algorithms plugin is installed: https://neo4j.com/docs/graph-algorithms/current/introduction/#_installation