简体   繁体   中英

MySQL Query to find friends and number of mutual friends

I have looked through the questions but I cant find anything that does exactly what I need and I can't figure out how to do it myself.

I have 2 tables, a user table and a friend link table. The user table is a table of all my users:

    +---------+------------+---------+---------------+
    | user_id | first_name | surname |     email     |
    +---------+------------+---------+---------------+
          1         joe       bloggs    joe@test.com
          2         bill      bloggs    bill@test.com
          3         john      bloggs    john@test.com
          4         karl      bloggs    karl@test.com

My friend links table then shows all relationships between the users, for example:

    +--------=+---------+-----------+--------+
    | link_id | user_id | friend_id | status |
    +---------+---------+-----------+--------+
       1         1          3           a
       2         3          1           a
       3         4          3           a
       4         3          4           a
       5         2          3           a
       6         3          2           a

As a note the a in the status column means approved, there could also be r(request) and d(declined).

What I want to do is have a query where if a user does a search it will bring back a list of users that they are currently not already friends with and how many mutual friends each user has with them.

I have managed to get a query for all users that are currently not friends with them. So if the user doing the search had the user id of 1:

SELECT u.user_id,u.first_name,u.surname
FROM users u
    LEFT JOIN friend_links fl
        ON u.user_id = fl.user_id AND 1 IN (fl.friend_id)
WHERE fl.friend_id IS NULL
AND u.user_id != 1
AND surname LIKE 'bloggs'

How then do I have a count of the number of mutual friends for each returned user?

EDIT:

Just as an edit as I don't think I am being particularly clear with my question.

The query that I currently have above will produce the following set of results:

    +---------+------------+---------+
    | user_id | first_name | surname |
    +---------+------------+---------+
          2         bill      bloggs
          4         karl      bloggs

Those are the users matching the surname bloggs that are not currently friends with joe bloggs (user id 1).

Then I want to have how many mutual friends each of these users has with the user doing the search so the returned results would look like:

    +---------+------------+---------+--------+
    | user_id | first_name | surname | mutual |
    +---------+------------+---------+--------+
          2         bill      bloggs     1
          4         karl      bloggs     1

Each of these returned users has 1 mutual friend as joe bloggs (user id 1) is friends with john bloggs and john bloggs is friends with both returned users.

I hope this is a bit more clear.

Thanks.

Mutual friends can be found by joining the friend_links table to itself on the friend_id field like so:

SELECT *
FROM friend_links f1 INNER JOIN friend_links f2
  ON f1.friend_id = f2.friend_id
WHERE f1.user_id = $person1
  AND f2.user_id = $person2

But bear in mind that this, in its worst case, is essentially squaring the number of rows in the friend_links table and can pretty easily jack up your server once you have a non-trivial number of rows. A better option would be to use 2 sub-queries for each user and then join the results of those.

SELECT *
FROM (
  SELECT *
  FROM friend_links
  WHERE user_id = $person1
) p1 INNER JOIN (
  SELECT *
  FROM friend_links
  WHERE user_id = $person1
) p2
  ON p1.friend_id = p2.friend_id

Also, you can simplify your friend_links table by removing the surrogate key link_id and just making (user_id,friend_id) the primary key since they must be unique anyway.


Edit:

How would this be applied to the original query of searching for users that aren't already friends, I would like to do both in a single query if possible?

SELECT f2.user_id, COUNT(*) 'friends_in_common'
FROM friend_links f1 LEFT JOIN friend_links f2
  ON f1.friend_id = f2.friend_id
WHERE f1.user_id = $person
GROUP BY f2.user_id
ORDER BY friends_in_common DESC
LIMIT $number

I am also thinking that the user_id constraints can be moved from the WHERE clause into the JOIN conditions to reduce the size of the data set created by the self-join and preclude the use of subqueries like in my second example.

This query lists anyone who's not friend with user 1 and whose surname matches '%bloggs%' :

SELECT
  users.user_id,
  users.first_name,
  users.surname,
  Sum(IF(users.user_id = friend_links_1.friend_id, 1, 0)) As mutual
FROM
  users inner join
    (friend_links INNER JOIN friend_links friend_links_1
     ON friend_links.friend_id = friend_links_1.user_id)
  ON friend_links.user_id=1 AND users.user_id<>1
WHERE
  users.surname LIKE '%bloggs%'
GROUP BY
  users.user_id, users.first_name, users.surname
HAVING
  Sum(IF(users.user_id = friend_links.friend_id, 1, 0))=0

just change the user id on the ON clause, and the surname on the WHERE clause. I think it should work correctly now!

If A is friend of B, then B is also a friend of A? Wouldn't it be better to use just a link instead of two links (and instead of two rows in friends_links)? Then you have to use two status columns, status1 and status2, and A is friend of B only if status1 = status2 = "a".

There are many ways to show mutual friends, eg:

SELECT friend_id
FROM friend_links
WHERE friend_links.user_id = $user1 or friend_links.user_id = $user2
  AND NOT (friend_links.friend_id = $user1 or friend_links.friend_id = $user2)
GROUP BY friend_id
HAVING Count(*)>1

And this query shows for each user and anyone who's not his/her friend:

SELECT
  users.user_id,
  users.first_name,
  users_1.user_id,
  users_1.first_name
FROM
  users INNER JOIN users users_1 ON users.user_id <> users_1.user_id
WHERE
  NOT EXISTS (SELECT *
              FROM friend_links
              WHERE
                friend_links.user_id = users.user_id
                AND friend_links.friend_id = users_1.user_id)

(The only think I didn't check is the friendship status, but it's easy to add that check).

I'm still working on it, but it's not easy to combine nicely these two queries togheter. So this isn't exactly an answer, I'm just showing some ideas that i've tried.

But what do you need exactly? A query that returns every user with anyone who's not his/her friend and the number of friends in common, or is the user_id already given?

With some code it's not a problem to answer your question... but there has to be a nice way just by using SQL! :)

EDIT:

I'm still wondering if there's a better solution to this, in particular the next query could be extremely slow, but it looks like this might work:

SELECT
  users_1.user_id,
  users_2.user_id,
  Sum(IF(users_1.user_id = friend_links.user_id AND users_2.user_id = friend_links_1.friend_id, 1, 0)) As CommonFriend
FROM
  users users_1 INNER JOIN users users_2
    ON users_1.user_id <> users_2.user_id,
  (friend_links INNER JOIN friend_links friend_links_1
    ON friend_links.friend_id = friend_links_1.user_id)
GROUP BY
  users_1.user_id,
  users_2.user_id
HAVING
  Sum(IF(users_1.user_id = friend_links.user_id AND users_2.user_id = friend_links.friend_id, 1, 0))=0

(as before, i didn't check friendship status)

If user is given, you could put WHERE users_1.user_id=$user1 but it's better to just leave one user table, and filter the next INNER JOIN whith that user.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM