COUNT in LEFT JOIN returning duplicated value
I have the following tables (example):
users:
id | user | photo | joined | country
1 | Igor | abc.jpg | 2015 | Brazil
2 | John | cga.png | 2014 | USA
3 | Lucas | hes.jpg | 2016 | Japan
posts (note that there are two rows with author = Igor and ft = 2, and one row with author = Igor and ft = 3, so Igor has three posts):
id | author | content | date | ft (2 = photos and 3 = videos)
1 | Igor | hi | 2016 | 2
2 | Igor | hello | 2016 | 3
3 | John | hehehe | 2016 | 2
4 | Igor | huhuhuh | 2016 | 2
5 | Lucas | lol | 2016 | 3
friendship (status = 2 means that they are friends):
id | friend1 | friend2 | status
1 | Igor | Lucas | 2
2 | Lucas | John | 2
3 | John | Igor | 2
I want a COUNT of posts with ft = 2 and a COUNT of friends (status = 2) for the currently logged-in user (Igor, in this case).
So I run the following (assuming the currently logged-in user is Igor):
SELECT photo, joined, country, sum(CASE WHEN ft = 2 THEN 1 ELSE 0 END) AS numPhotos, sum(CASE WHEN ft = 3 THEN 1 ELSE 0 END) AS numVideos
FROM users
LEFT JOIN posts
ON users.user = posts.author
WHERE users.user = 'Igor'
GROUP BY users.user
LIMIT 1
When I iterate over the result with a foreach, the data is correct: numPhotos = 2 and numVideos = 1.
But I also want to select the number of friends, so I run:
SELECT photo, joined, country, sum(CASE WHEN ft = 2 THEN 1 ELSE 0 END) AS numPhotos, sum(CASE WHEN ft = 3 THEN 1 ELSE 0 END) AS numVideos, count(friendship.status) AS numFriends
FROM users
LEFT JOIN posts
ON users.user = posts.author
LEFT JOIN friendship
ON (users.user = friend1 OR users.user = friend2) AND friendship.status = 2
WHERE users.user = 'Igor'
GROUP BY users.user
LIMIT 1
But the output is numPhotos = 4, numVideos = 2 and numFriends = 6.
In other words, the query doubles every result, and for numFriends it takes Igor's total number of posts (3) and doubles that as well. If I change count(friendship.status) to sum(friendship.status), the output is numPhotos = 4, numVideos = 2 and numFriends = 18 (numFriends is tripled).
I also tried count(distinct friendship.status), and the result is numPhotos = 4, numVideos = 2 and numFriends = 1 (the values are doubled again, and numFriends is wrongly 1 when it should be 2, since he has two friends).
So, how can I do this? (I'm using MySQL)
EDIT:
I changed count(distinct friendship.status) to count(distinct friendship.id), and that correctly selects the number of friends. But the other values (numPhotos and numVideos) are still duplicated.
I discovered that the problem is in ON (users.user = friend1 OR users.user = friend2), because if I leave only ON (users.user = friend1) or ON (users.user = friend2) the output isn't duplicated. I also tried ON 'Igor' IN (friend1, friend2), but the result is the same (numPhotos and numVideos are still duplicated).
I think the left join is joining on a one-to-many relationship, which inflates the counts: each of Igor's three post rows is paired with each of his two friendship rows, so the aggregates run over six rows instead of three. Since you are only retrieving the counts for one user, I suggest using a subquery to retrieve the friendship count (when retrieving the counts for multiple users, a derived table may be faster than a subquery):
SELECT
sum(ft = 2) AS numPhotos,
sum(ft = 3) AS numVideos,
(select count(*) from friendship f
where (friend1 = users.user
or friend2 = users.user)
and status = 2) as friendship_count
FROM users
LEFT JOIN posts
ON users.user = posts.author
WHERE users.user = 'Igor'
Note that I removed the group by because users.user is already fixed in the where clause, which means there is only one group.
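To see why the second join inflates the sums, here is a minimal sketch that rebuilds the sample tables in an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made only so the example is self-contained; the question uses MySQL, and behavior there may differ in details. The sketch counts the rows produced by the two chained joins, then runs the subquery version with the table name friendship and the alias numFriends taken from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, user TEXT, photo TEXT, joined INTEGER, country TEXT);
CREATE TABLE posts (id INTEGER, author TEXT, content TEXT, date INTEGER, ft INTEGER);
CREATE TABLE friendship (id INTEGER, friend1 TEXT, friend2 TEXT, status INTEGER);
INSERT INTO users VALUES
  (1, 'Igor', 'abc.jpg', 2015, 'Brazil'),
  (2, 'John', 'cga.png', 2014, 'USA'),
  (3, 'Lucas', 'hes.jpg', 2016, 'Japan');
INSERT INTO posts VALUES
  (1, 'Igor', 'hi', 2016, 2),
  (2, 'Igor', 'hello', 2016, 3),
  (3, 'John', 'hehehe', 2016, 2),
  (4, 'Igor', 'huhuhuh', 2016, 2),
  (5, 'Lucas', 'lol', 2016, 3);
INSERT INTO friendship VALUES
  (1, 'Igor', 'Lucas', 2),
  (2, 'Lucas', 'John', 2),
  (3, 'John', 'Igor', 2);
""")

# Both joins together: every post row of Igor is repeated once per friendship row.
FAN_OUT = """
SELECT count(*)
FROM users
LEFT JOIN posts ON users.user = posts.author
LEFT JOIN friendship
  ON (users.user = friend1 OR users.user = friend2) AND friendship.status = 2
WHERE users.user = 'Igor'
"""
joined_rows = conn.execute(FAN_OUT).fetchone()[0]
print(joined_rows)  # 6 (3 posts x 2 friendships)

# Friend count moved into a correlated subquery, so posts are joined only once.
FIXED = """
SELECT sum(ft = 2) AS numPhotos,
       sum(ft = 3) AS numVideos,
       (SELECT count(*) FROM friendship f
         WHERE (f.friend1 = users.user OR f.friend2 = users.user)
           AND f.status = 2) AS numFriends
FROM users
LEFT JOIN posts ON users.user = posts.author
WHERE users.user = 'Igor'
"""
fixed = conn.execute(FIXED).fetchone()
print(fixed)  # (2, 1, 2)
```

Because the friendship table never takes part in the outer join, the post sums are computed over Igor's three post rows only.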
Instead of count(distinct friendship.status), try count(distinct friendship.id). That should give you the number of unique friends. Counting distinct statuses doesn't work because every joined status is 2 by definition, so there is only one distinct value.
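As the edit to the question points out, count(distinct friendship.id) repairs only numFriends; the post sums stay doubled because the joined rows are still multiplied. One possible way to correct all three values at once, and for several users in one query, is the derived-table approach mentioned above: aggregate each table on its own before joining. The following is a sketch only, run against an in-memory SQLite database through Python's sqlite3 module (an assumption for self-containment; the question uses MySQL). The aliases u, p, f, either_side and friend are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, user TEXT, photo TEXT, joined INTEGER, country TEXT);
CREATE TABLE posts (id INTEGER, author TEXT, content TEXT, date INTEGER, ft INTEGER);
CREATE TABLE friendship (id INTEGER, friend1 TEXT, friend2 TEXT, status INTEGER);
INSERT INTO users VALUES
  (1, 'Igor', 'abc.jpg', 2015, 'Brazil'),
  (2, 'John', 'cga.png', 2014, 'USA'),
  (3, 'Lucas', 'hes.jpg', 2016, 'Japan');
INSERT INTO posts VALUES
  (1, 'Igor', 'hi', 2016, 2),
  (2, 'Igor', 'hello', 2016, 3),
  (3, 'John', 'hehehe', 2016, 2),
  (4, 'Igor', 'huhuhuh', 2016, 2),
  (5, 'Lucas', 'lol', 2016, 3);
INSERT INTO friendship VALUES
  (1, 'Igor', 'Lucas', 2),
  (2, 'Lucas', 'John', 2),
  (3, 'John', 'Igor', 2);
""")

# count(DISTINCT friendship.id) fixes the friend count, but the sums are still doubled.
DISTINCT_ID = """
SELECT sum(ft = 2), sum(ft = 3), count(DISTINCT friendship.id)
FROM users
LEFT JOIN posts ON users.user = posts.author
LEFT JOIN friendship
  ON (users.user = friend1 OR users.user = friend2) AND friendship.status = 2
WHERE users.user = 'Igor'
"""
distinct_only = conn.execute(DISTINCT_ID).fetchone()
print(distinct_only)  # (4, 2, 2)

# Pre-aggregate posts and friendships separately, then join one row per user.
# The aliases (u, p, f, either_side, friend) are illustrative names only.
PRE_AGGREGATED = """
SELECT u.user,
       COALESCE(p.numPhotos, 0)  AS numPhotos,
       COALESCE(p.numVideos, 0)  AS numVideos,
       COALESCE(f.numFriends, 0) AS numFriends
FROM users u
LEFT JOIN (SELECT author, sum(ft = 2) AS numPhotos, sum(ft = 3) AS numVideos
           FROM posts
           GROUP BY author) p ON p.author = u.user
LEFT JOIN (SELECT friend, count(*) AS numFriends
           FROM (SELECT friend1 AS friend FROM friendship WHERE status = 2
                 UNION ALL
                 SELECT friend2 AS friend FROM friendship WHERE status = 2) AS either_side
           GROUP BY friend) f ON f.friend = u.user
ORDER BY u.id
"""
per_user = conn.execute(PRE_AGGREGATED).fetchall()
print(per_user)  # [('Igor', 2, 1, 2), ('John', 1, 0, 2), ('Lucas', 0, 1, 2)]
```

Each derived table returns at most one row per user, so joining them cannot multiply rows. Adding WHERE u.user = 'Igor' to the outer query restricts the result to the logged-in user.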