SQL: Suggested friends with 1 degree of separation where my friends share more than 2 mutual friends

I'm having a problem writing a query to suggest friends to my end users when their friends share more than one friend in common. The schema currently in use is far from optimal, but my boss is adamant that I am not allowed to alter the table structure, even though I've told him that storing two rows per friend relationship (one for each direction) is much faster than storing one.

We currently have one pair of values for each friendship:

friendID  |  Entity_ID1  |  Entity_Id2
   1              2             3
   2              1             4
   3              2             5

I know that also storing the inverse of each pair would make my query much simpler. So far I have devised the following query to attempt to find suggested friends for a user:

  SELECT DISTINCT Entity_Id, Fb_Id, First_Name, Last_Name, Profile_Pic_Url, Last_CheckIn_Place, Category
  FROM entity
  JOIN friends F1
  ON entity.Entity_Id = F1.Entity_Id2 OR entity.Entity_Id = F1.Entity_Id1
  /* Friends of Friends */
  WHERE F1.Entity_Id2 IN
  (
    SELECT Entity_Id1
      FROM friends F
     WHERE F.Entity_Id2 = :userId
       AND F.Category != 4

     UNION

     SELECT Entity_Id2
      FROM friends F
     WHERE F.Entity_Id1 = :userId
       AND F.Category != 4
  )
  /* Exclude my friends */
  AND F1.Entity_Id1 NOT IN
  (
    SELECT Entity_Id1
      FROM friends F
     WHERE F.Entity_Id2 = :userId
       AND F.Category != 4

     UNION

     SELECT Entity_Id2
      FROM friends F
     WHERE F.Entity_Id1 = :userId
       AND F.Category != 4
  )
  /* Exclude self */
  AND F1.Entity_Id1 != :userId
  GROUP BY Entity_Id

  /* Perform again for userId 2 */
  UNION

  SELECT DISTINCT Entity_Id, Fb_Id, First_Name, Last_Name, Profile_Pic_Url, Last_CheckIn_Place, Category
  FROM entity
  JOIN friends F2
  ON entity.Entity_Id = F2.Entity_Id2 OR entity.Entity_Id = F2.Entity_Id1
  WHERE F2.Entity_Id1 IN
  (
    SELECT Entity_Id1
      FROM friends F
     WHERE F.Entity_Id2 = :userId
       AND F.Category != 4

     UNION

     SELECT Entity_Id2
      FROM friends F
     WHERE F.Entity_Id1 = :userId
       AND F.Category != 4
  )
  /* Exclude my friends */
  AND F2.Entity_Id2 NOT IN
  (
    SELECT Entity_Id1
      FROM friends F
     WHERE F.Entity_Id2 = :userId
       AND F.Category != 4

     UNION

     SELECT Entity_Id2
      FROM friends F
     WHERE F.Entity_Id1 = :userId
       AND F.Category != 4
  )
  AND F2.Entity_Id2 != :userId
  GROUP BY Entity_Id

This sort of works; however, it returns users that I am already friends with, which is not what I want. I thought that having the NOT IN() clause for my friends, and then using UNION to merge the results, would strip my friends out, but apparently it does not.

What am I doing wrong here, and is there any way to make this query shorter without modifying the schema? Right now it seems far too long and rather unmanageable.

Missing the reciprocal relationship does make this much harder. It requires checking both directions of the relationship. You seem to be pursuing a strategy of using union to reconstruct both sides of the relationship.
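One way to sketch that union strategy without touching the table is to build a symmetric edge list in a common table expression, so every friendship appears once per direction, and then join that list to itself. This is only an illustration under assumptions: the names `edges`, `mine`, `theirs`, `suggested_id` and `mutual_friends` are made up here, the `WITH` syntax needs a database that supports it (for example MySQL 8.0+, PostgreSQL or SQLite), and applying `Category != 4` to every friendship may or may not match the intended rule.

```sql
WITH edges AS (
    -- every friendship once per direction: (a is a friend of b) and (b is a friend of a)
    SELECT Entity_Id1 AS a, Entity_Id2 AS b FROM friends WHERE Category != 4
    UNION ALL
    SELECT Entity_Id2 AS a, Entity_Id1 AS b FROM friends WHERE Category != 4
)
SELECT theirs.b                AS suggested_id,
       COUNT(DISTINCT mine.b)  AS mutual_friends
FROM edges mine
JOIN edges theirs
  ON theirs.a = mine.b                  -- mine.b is one of my friends
WHERE mine.a = :userId
  AND theirs.b != :userId               -- exclude self
  AND NOT EXISTS (                      -- exclude people I am already friends with
        SELECT 1
        FROM edges x
        WHERE x.a = :userId
          AND x.b = theirs.b
      )
GROUP BY theirs.b
HAVING COUNT(DISTINCT mine.b) >= 2;     -- adjust the threshold as needed
```

As a check on hypothetical data: with friendship rows (1,2), (1,3), (4,2), (3,4) and (2,5) and `:userId` bound to 1, entity 4 shares friends 2 and 3 with user 1 and is returned, while entity 5 shares only friend 2 and is filtered out. Joining the result to `entity` on `Entity_Id = suggested_id` would bring back the profile columns.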

Alternatively, you can use exists and subqueries. The following version finds entities that are not already friends of the user (using not exists) and that have at least two friends in common with the user (using a correlated count):

select e.*
from entity e
where e.entity_id <> :userId and
      -- exclude anyone who is already a friend, in either direction
      not exists (select 1
                  from friends f
                  where f.category <> 4 and
                        :userId in (f.entity_id1, f.entity_id2) and
                        e.entity_id in (f.entity_id1, f.entity_id2)
                 ) and
      -- count mutual friends: f1 is one of the user's friendships,
      -- f2 is one of the candidate's, and the two rows share an endpoint
      (select count(*)
       from friends f1 join
            friends f2
            on f1.entity_id1 = f2.entity_id1 or
               f1.entity_id1 = f2.entity_id2 or
               f1.entity_id2 = f2.entity_id1 or
               f1.entity_id2 = f2.entity_id2
       where :userId in (f1.entity_id1, f1.entity_id2) and
             e.entity_id in (f2.entity_id1, f2.entity_id2)
      ) >= 2

Hopefully, you don't have too much data. Neither this version nor the version you are attempting will perform well on larger amounts of data.
