I'm having a problem writing a query to suggest friends to my end users when they share more than one friend in common with another user. The schema that's currently being used is far from optimal, but my boss is adamant that I am not allowed to alter the table structure, even though I've told him that providing 2 columns for a friend relationship is much faster than one column.
We currently have one pair of values for each friendship:
friendID | Entity_Id1 | Entity_Id2
1        | 2          | 3
2        | 1          | 4
3        | 2          | 5
I know that storing an inverse for each of these rows would make my query much simpler. So far I have devised the following query to attempt to find suggested friends for a user:
SELECT DISTINCT Entity_Id, Fb_Id, First_Name, Last_Name, Profile_Pic_Url, Last_CheckIn_Place, Category
FROM entity
JOIN friends F1
ON entity.Entity_Id = F1.Entity_Id2 OR entity.Entity_Id = F1.Entity_Id1
/* Friends of Friends */
WHERE F1.Entity_Id2 IN
(
SELECT Entity_Id1
FROM friends F
WHERE F.Entity_Id2 = :userId
AND F.Category != 4
UNION
SELECT Entity_Id2
FROM friends F
WHERE F.Entity_Id1 = :userId
AND F.Category != 4
)
/* Exclude my friends */
AND F1.Entity_Id1 NOT IN
(
SELECT Entity_Id1
FROM friends F
WHERE F.Entity_Id2 = :userId
AND F.Category != 4
UNION
SELECT Entity_Id2
FROM friends F
WHERE F.Entity_Id1 = :userId
AND F.Category != 4
)
/* Exclude self */
AND F1.Entity_Id1 != :userId
GROUP BY Entity_Id
/* Perform again for userId 2 */
UNION
SELECT DISTINCT Entity_Id, Fb_Id, First_Name, Last_Name, Profile_Pic_Url, Last_CheckIn_Place, Category
FROM entity
JOIN friends F2
ON entity.Entity_Id = F2.Entity_Id2 OR entity.Entity_Id = F2.Entity_Id1
WHERE F2.Entity_Id1 IN
(
SELECT Entity_Id1
FROM friends F
WHERE F.Entity_Id2 = :userId
AND F.Category != 4
UNION
SELECT Entity_Id2
FROM friends F
WHERE F.Entity_Id1 = :userId
AND F.Category != 4
)
/* Exclude my friends */
AND F2.Entity_Id2 NOT IN
(
SELECT Entity_Id1
FROM friends F
WHERE F.Entity_Id2 = :userId
AND F.Category != 4
UNION
SELECT Entity_Id2
FROM friends F
WHERE F.Entity_Id1 = :userId
AND F.Category != 4
)
AND F2.Entity_Id2 != :userId
GROUP BY Entity_Id
This sort of works, but it returns users that I am already friends with, which is not what I want. I thought that having the NOT IN() clause for my friends, and then using UNION to merge the results, would strip my friends out, but apparently it does not.
What am I doing wrong here, and is there any way to make this query shorter without modifying the schema? Right now it seems far too long and rather unmanageable.
Missing the reciprocal relationship does make this much harder. It requires checking both directions of the relationship. You seem to be pursuing a strategy of using union to reconstruct both sides of the relationship.
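For illustration, here is a minimal sketch of that union strategy, where both directions are rebuilt once in a derived table instead of being repeated in every subquery. It is run through Python's sqlite3 module only so that it is self-contained. It assumes that Category is a column of friends, that the ID columns are never NULL, and that the demo rows (which are made up) resemble your data; both_ways, build_demo_db and suggest_friends are hypothetical names. The WITH clause needs a database that supports common table expressions (MySQL 8.0 or later, for example); otherwise the same union can be written inline as a derived table.

```python
import sqlite3

# Rebuild both directions of every friendship once, then self-join:
# mine = the user's friends, fof = the friends of those friends.
SUGGEST_SQL = """
WITH both_ways AS (
    SELECT Entity_Id1 AS a, Entity_Id2 AS b FROM friends WHERE Category != 4
    UNION
    SELECT Entity_Id2 AS a, Entity_Id1 AS b FROM friends WHERE Category != 4
)
SELECT fof.b AS suggested_id, COUNT(*) AS mutual_friends
FROM both_ways AS mine
JOIN both_ways AS fof ON fof.a = mine.b
WHERE mine.a = :userId
  AND fof.b != :userId                      -- exclude self
  AND fof.b NOT IN (SELECT b FROM both_ways -- exclude existing friends
                    WHERE a = :userId)
GROUP BY fof.b
HAVING COUNT(*) >= 2                        -- at least two friends in common
ORDER BY fof.b
"""


def build_demo_db():
    """Create an in-memory table with made-up rows (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE friends ("
        "friendID INTEGER PRIMARY KEY, "
        "Entity_Id1 INTEGER NOT NULL, "
        "Entity_Id2 INTEGER NOT NULL, "
        "Category INTEGER NOT NULL)"
    )
    rows = [
        (1, 2, 1, 1),  # 1 and 2 are friends (stored as 2 -> 1)
        (2, 1, 3, 1),  # 1 and 3 are friends
        (3, 4, 2, 1),  # 2 and 4 are friends
        (4, 2, 5, 1),  # 2 and 5 are friends
        (5, 3, 4, 1),  # 3 and 4 are friends
        (6, 3, 5, 4),  # category 4, ignored by the query
    ]
    conn.executemany("INSERT INTO friends VALUES (?, ?, ?, ?)", rows)
    return conn


def suggest_friends(conn, user_id):
    """Return (suggested_id, mutual_friends) rows for the given user."""
    return conn.execute(SUGGEST_SQL, {"userId": user_id}).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    print(suggest_friends(demo, 1))  # prints [(4, 2)]
```

Because the union removes duplicate pairs, each mutual friend is counted once even if a friendship happens to be stored in both directions. Joining the result back to entity on suggested_id would give you the profile columns.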
Alternatively, you can use exists and subqueries. The following version uses exists to find entities that are not already friends with the user and that have at least two friends in common with the user:
select e.*
from entity e
where e.entity_id <> :user_id and
not exists (select 1
from friends f
where f.category <> 4 and
:user_id in (f.entity_id1, f.entity_id2) and
e.entity_id in (f.entity_id1, f.entity_id2)
) and
(select count(*)
from friends f1 join
friends f2
on f1.entity_id1 = f2.entity_id1 or
f1.entity_id1 = f2.entity_id2 or
f1.entity_id2 = f2.entity_id1 or
f1.entity_id2 = f2.entity_id2
-- the user must appear in f1 and the candidate in f2, so each mutual friend is counted once
where :user_id in (f1.entity_id1, f1.entity_id2) and
e.entity_id in (f2.entity_id1, f2.entity_id2)
) >= 2
Hopefully, you don't have too much data. Neither this version nor the version you are attempting will have good performance on larger amounts of data.