
Mutual friends SQL

I've seen multiple SO posts on mutual friends, but I've structured the friends table in my database so that each friendship is stored only once, e.g. (1,2) and never (2,1):

    Create Table Friends(
      user1_id int, 
      user2_id int
    );

A constraint then makes sure that user1_id is always smaller than user2_id, e.g. 4 < 5.
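The question does not show the constraint itself, so here is a minimal sketch of what it could look like. It runs against SQLite through Python's `sqlite3` module purely so that it is self-contained; the composite primary key is an added assumption, not something the question states. In MySQL the same `CHECK` clause is enforced from 8.0.16 onwards; earlier versions parse it but ignore it.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Friends (
        user1_id INT NOT NULL,
        user2_id INT NOT NULL,
        PRIMARY KEY (user1_id, user2_id),  -- assumption: no repeated pair
        CHECK (user1_id < user2_id)        -- store (1,2), never (2,1)
    )
""")

conn.execute("INSERT INTO Friends VALUES (1, 2)")  # accepted: 1 < 2

try:
    conn.execute("INSERT INTO Friends VALUES (2, 1)")  # wrong order
    reversed_pair_rejected = False
except sqlite3.IntegrityError:
    # The CHECK constraint refuses the reversed pair.
    reversed_pair_rejected = True

rows = conn.execute("SELECT user1_id, user2_id FROM Friends").fetchall()
print(reversed_pair_rejected, rows)
```

With the constraint in place, application code has to order the two ids before inserting, for example by inserting the smaller id as user1_id and the larger as user2_id.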

Mutual friends sql with join (Mysql)

I've seen suggestions that mutual friends can be found with a self-join, so this is what I have. I think it's wrong, though, because the counts the query returns don't match what I get when I count the mutual friends in my data by hand:

    select f1.user1_id as user1, f2.user1_id as user2,
           count(f1.user2_id) as mutual_count
    from Friends f1
    join Friends f2
      on f1.user2_id = f2.user2_id and f1.user1_id <> f2.user1_id
    group by f1.user1_id, f2.user1_id
    order by mutual_count desc

There are three join scenarios that I can see.

1 -> 2 -> 3    (mutual friend id between other IDs)    
2 -> 3 -> 1    (mutual friend id > other IDs)    
2 -> 1 -> 3    (mutual friend id < other IDs)    
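To make the three scenarios concrete, the sketch below builds one hypothetical pair of users per scenario and runs the query from the question against it (again on SQLite via Python's `sqlite3`, as an assumption for portability; the extra `ORDER BY` columns are added only to make the output deterministic). Because that query joins on `f1.user2_id = f2.user2_id`, it can only see a mutual friend whose id is larger than both of the others.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Friends (user1_id INT, user2_id INT)")

# Hypothetical sample data: one pair of users per scenario.
conn.executemany("INSERT INTO Friends VALUES (?, ?)", [
    (1, 2), (2, 3),  # 1 and 3 share friend 2 (mutual id between the others)
    (4, 6), (5, 6),  # 4 and 5 share friend 6 (mutual id above the others)
    (7, 8), (7, 9),  # 8 and 9 share friend 7 (mutual id below the others)
])

# The query from the question, with a deterministic ORDER BY added.
question_query = """
    SELECT f1.user1_id AS user1, f2.user1_id AS user2,
           COUNT(f1.user2_id) AS mutual_count
    FROM Friends f1
    JOIN Friends f2
      ON f1.user2_id = f2.user2_id AND f1.user1_id <> f2.user1_id
    GROUP BY f1.user1_id, f2.user1_id
    ORDER BY mutual_count DESC, user1, user2
"""
found = conn.execute(question_query).fetchall()
print(found)
```

Only users 4 and 5 come back (once in each order); the pairs 1/3 and 8/9 are missed entirely, which would explain counts that are lower than a manual check of the data.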

All three scenarios can be covered with a single join predicate:

ON (   f1.user1_id IN (f2.user1_id, f2.user2_id)
    OR f1.user2_id IN (f2.user1_id, f2.user2_id))
AND NOT (f1.user1_id = f2.user1_id AND f1.user2_id = f2.user2_id)  -- don't join a row to itself

But the OR conditions will badly hurt the optimiser's ability to use indexes.

So I'd union multiple queries instead.

(Typed on a phone and untested, so treat it as a sketch.)

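-- Each pair is reported once, smaller id first.
SELECT user1, user2, COUNT(*) AS mutual_count FROM
(
    -- mutual friend id between the pair: (a,m) joined to (m,b)
    SELECT f1.user1_id AS user1, f2.user2_id AS user2 FROM Friends f1 INNER JOIN Friends f2 ON f1.user2_id = f2.user1_id AND f1.user1_id < f2.user2_id
    UNION ALL
    -- mutual friend id above the pair: (a,m) joined to (b,m)
    SELECT f1.user1_id, f2.user1_id FROM Friends f1 INNER JOIN Friends f2 ON f1.user2_id = f2.user2_id AND f1.user1_id < f2.user1_id
    UNION ALL
    -- mutual friend id below the pair: (m,a) joined to (m,b)
    SELECT f1.user2_id, f2.user2_id FROM Friends f1 INNER JOIN Friends f2 ON f1.user1_id = f2.user1_id AND f1.user2_id < f2.user2_id
) all_combinations
GROUP BY user1, user2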

Each individual query can then make full use of indexes. (Put one index on user1_id and another on user2_id.)
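As a rough check of the union approach, the sketch below runs it on a small hypothetical data set with one pair of users per join scenario. Assumptions: SQLite via Python's `sqlite3` stands in for MySQL, the index names are made up, and the two same-column branches compare with `<` so that each pair appears once, smaller id first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Friends (
        user1_id INT,
        user2_id INT,
        CHECK (user1_id < user2_id)
    );
    -- Hypothetical index names: one index per column.
    CREATE INDEX idx_friends_user1 ON Friends (user1_id);
    CREATE INDEX idx_friends_user2 ON Friends (user2_id);
""")

# Hypothetical sample data: one pair of users per join scenario.
conn.executemany("INSERT INTO Friends VALUES (?, ?)", [
    (1, 2), (2, 3),  # 1 and 3 share friend 2 (mutual id between the others)
    (4, 6), (5, 6),  # 4 and 5 share friend 6 (mutual id above the others)
    (7, 8), (7, 9),  # 8 and 9 share friend 7 (mutual id below the others)
])

mutual_query = """
    SELECT user1, user2, COUNT(*) AS mutual_count
    FROM (
        -- mutual friend id between the pair: (a,m) joined to (m,b)
        SELECT f1.user1_id AS user1, f2.user2_id AS user2
        FROM Friends f1
        INNER JOIN Friends f2 ON f1.user2_id = f2.user1_id
        UNION ALL
        -- mutual friend id above the pair: (a,m) joined to (b,m)
        SELECT f1.user1_id, f2.user1_id
        FROM Friends f1
        INNER JOIN Friends f2
            ON f1.user2_id = f2.user2_id AND f1.user1_id < f2.user1_id
        UNION ALL
        -- mutual friend id below the pair: (m,a) joined to (m,b)
        SELECT f1.user2_id, f2.user2_id
        FROM Friends f1
        INNER JOIN Friends f2
            ON f1.user1_id = f2.user1_id AND f1.user2_id < f2.user2_id
    ) AS all_combinations
    GROUP BY user1, user2
    ORDER BY user1, user2
"""
mutual_counts = conn.execute(mutual_query).fetchall()
print(mutual_counts)
```

Each of the three pairs (1/3, 4/5 and 8/9) comes back with a count of one, so every scenario is covered. Whether the indexes are actually used depends on the engine and the data volume, so it is worth confirming with `EXPLAIN` on the real table.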

The result should be less esoteric than the single-join alternative (which needs fairly long CASE expressions) and should have a much cheaper execution plan.
