Computing similarity between all nodes in Neo4j: getting different values for a node pair
I have two kinds of nodes in my database: USER and MEDIA.
And one relationship: "LIKES".
The relationship between the two nodes is described like so:
(:USER)-[:LIKES]->(:MEDIA)
I'm trying to compute the similarity between all the "USER" nodes based on the number of media shared between each node pair (Jaccard similarity).
This similarity is then stored as an "ISSIMILAR" relationship. The "ISSIMILAR" relationship has an attribute called "similarity", which stores the similarity between the two nodes.
Here's my query:
Match(u:User)
WITH COLLECT(u) as users
UNWIND users as user
MATCH(user:User{id:user.id})-[:LIKES]->(common_media:Media)<-[:LIKES]-(other:User)
WITH user,other,count(common_media) AS intersection, COLLECT(common_media.name) as i
MATCH(user)-[:LIKES]->(user_media:Media)
WITH user,other,intersection,i, COLLECT(user_media.name) AS s1
MATCH(other)-[:LIKES]->(other_media:Media)
WITH user,other,intersection,i,s1, COLLECT(other_media.name) AS s2
WITH user,other,intersection,s1,s2
WITH user,other,intersection,s1+filter(x IN s2 WHERE NOT x IN s1) AS union, s1,s2
WITH ((1.0*intersection)/SIZE(union)) as jaccard,user,other
MERGE(user)-[:ISSIMILAR{similarity:jaccard}]-(other)
Running this query, I have two issues:
Here's a visualization of the issue:
MATCH(user:User)-[r]-(o:User) return o,user,r limit 4
Thanks in advance
The problem of two similarity relationships per pair arises because the query does not exclude pairs for which a similarity relationship has already been constructed. You can avoid this by processing each pair of users only once:
...
UNWIND users as user
UNWIND users as other
WITH user, other WHERE ID(user) > ID(other)
MATCH(user)-[:LIKES]->(common_media:Media)<-[:LIKES]-(other)
...
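To see why the ID comparison helps, here is a minimal Python sketch of the same double loop. The integer ids are made up for illustration and stand in for ID(user) and ID(other); this is not how Neo4j evaluates the query, only a model of which pairs survive the filter.

```python
from itertools import product

# Hypothetical internal node ids standing in for ID(user) / ID(other).
user_ids = [0, 1, 2, 3]

# Double UNWIND without a filter: every ordered pair, including self-pairs.
all_pairs = [(u, o) for u, o in product(user_ids, repeat=2)]

# The same double loop with the ID(user) > ID(other) filter applied.
unique_pairs = [(u, o) for u, o in product(user_ids, repeat=2) if u > o]

print(len(all_pairs))     # 16
print(len(unique_pairs))  # 6
```

With the filter, each unordered pair appears exactly once, so the later MERGE is reached only once per pair.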
And the final query can be made clearer:
MATCH (u:User) WITH COLLECT(u) AS users
UNWIND users AS user
UNWIND users AS other
MATCH (user)-[:LIKES]->(common_media:Media)<-[:LIKES]-(other) WHERE ID(other) > ID(user)
WITH user, other, COLLECT(common_media) AS intersection
MATCH (user)-[:LIKES]->(user_media:Media)
WITH user, other, intersection,
COLLECT(user_media) AS s1
MATCH (other)-[:LIKES]->(other_media:Media)
WITH user,other,intersection, s1,
COLLECT(other_media) AS s2
WITH user, other,
(1.0 * SIZE(intersection)) / (SIZE(s1) + SIZE(s2) - SIZE(intersection)) AS jaccard
MERGE (user)-[:ISSIMILAR {similarity: jaccard}]->(other)
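The denominator in that query relies on the identity |A ∪ B| = |A| + |B| - |A ∩ B|, so the union never has to be built explicitly. A small Python sketch of the same arithmetic follows; the function name and the media names are invented for illustration, and it assumes each user likes a given media item at most once.

```python
def jaccard(liked_a, liked_b):
    """Jaccard similarity of two collections of liked media."""
    a, b = set(liked_a), set(liked_b)
    intersection = len(a & b)
    # Inclusion-exclusion: |A ∪ B| = |A| + |B| - |A ∩ B|
    union = len(a) + len(b) - intersection
    # Two empty sets have no defined ratio; return 0.0 by convention here.
    return intersection / union if union else 0.0


# Made-up example data.
alice = ["m1", "m2", "m3"]
bob = ["m2", "m3", "m4", "m5"]

print(jaccard(alice, bob))  # 0.4
print(jaccard(bob, alice))  # 0.4
```

The value is the same in both directions, which is why a single ISSIMILAR relationship per pair is enough.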