Left joins --> aggregate function problem
I have four different tables in my database:
thread:
thread_rating:
thread_report:
thread_impression:
I am joining these tables with this SQL query:
SELECT t.thread_id,
t.thread_content,
SUM(tra.liked) AS liked,
SUM(tra.disliked) AS disliked,
t.timestamp,
((100*(tra.liked + SUM(tra.liked))) / (tra.liked + SUM(tra.liked) + (tra.disliked + SUM(tra.disliked)))) AS liked_percent,
((100*(COUNT(DISTINCT tre.thread_report_id)) / ((COUNT(DISTINCT ti.thread_impression_id))))) AS reported_percent
FROM thread AS t
LEFT JOIN thread_rating AS tra ON t.thread_id = tra.thread_id
LEFT JOIN thread_report AS tre ON tra.thread_id = tre.thread_id
LEFT JOIN thread_impression AS ti ON tre.thread_id = ti.thread_id
GROUP BY t.thread_id
ORDER BY liked_percent
The query should return every thread_id with its summed likes and dislikes, the likes as a percentage, the timestamp at which the thread was inserted into the database, and the reports as a percentage of the impressions (the number of times the thread was shown to a user).
Nearly all results are right; only the likes and dislikes are wrong.
If I add a COUNT(*) to the select list, I can see that the rows with correct results have a count of 1, while the wrong ones sometimes have a count of up to 60. It looks like a cross join problem.
I think this is an issue with the grouping, or perhaps I should put the joins in parentheses.
I've seen solutions with subselects, but I don't think that is a great solution for this issue.
What am I doing wrong here?
The tra table has multiple records per thread_id. Each of those records is repeated for every matching row in the other joined tables, which caused double counts in the SUM function.
Do the summations in a subselect, grouped by the join field. That way tra2 has only one row per thread_id to join with, and the duplicate rows are avoided.
SELECT t.thread_id,
t.thread_content,
tra2.liked,
tra2.disliked,
t.timestamp,
tra2.liked_percent,
((100*(COUNT(DISTINCT tre.thread_report_id)) / ((COUNT(DISTINCT ti.thread_impression_id))))) AS reported_percent
FROM thread AS t
LEFT JOIN (
SELECT
tra.thread_id
, SUM(tra.liked) AS liked
, SUM(tra.disliked) AS disliked
, (100 * SUM(tra.liked)) / (SUM(tra.liked) + SUM(tra.disliked)) AS liked_percent
FROM thread_rating AS tra
GROUP BY tra.thread_id
) as tra2 ON t.thread_id = tra2.thread_id
LEFT JOIN thread_report AS tre ON t.thread_id = tre.thread_id
LEFT JOIN thread_impression AS ti ON t.thread_id = ti.thread_id
GROUP BY t.thread_id
ORDER BY liked_percent DESC
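To see the effect on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The table layouts and sample rows are assumptions made up for illustration (the real schemas are not shown in the question), and SQLite stands in for whatever database you actually use. It runs a naive triple join next to the pre-aggregated subselect for a thread with 2 ratings, 2 reports and 3 impressions.

```python
import sqlite3

# In-memory database with a hypothetical minimal schema (column names taken
# from the query in the question; everything else is assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thread (thread_id INTEGER PRIMARY KEY, thread_content TEXT);
    CREATE TABLE thread_rating (thread_id INTEGER, liked INTEGER, disliked INTEGER);
    CREATE TABLE thread_report (thread_report_id INTEGER PRIMARY KEY, thread_id INTEGER);
    CREATE TABLE thread_impression (thread_impression_id INTEGER PRIMARY KEY, thread_id INTEGER);

    INSERT INTO thread VALUES (1, 'hello');
    -- one like and one dislike
    INSERT INTO thread_rating VALUES (1, 1, 0), (1, 0, 1);
    -- two reports and three impressions
    INSERT INTO thread_report (thread_id) VALUES (1), (1);
    INSERT INTO thread_impression (thread_id) VALUES (1), (1), (1);
""")

# Naive version: every rating row is repeated for each report/impression
# combination (2 * 2 * 3 = 12 joined rows), so the sums are inflated.
naive = conn.execute("""
    SELECT t.thread_id, SUM(tra.liked), SUM(tra.disliked), COUNT(*)
    FROM thread AS t
    LEFT JOIN thread_rating AS tra ON t.thread_id = tra.thread_id
    LEFT JOIN thread_report AS tre ON t.thread_id = tre.thread_id
    LEFT JOIN thread_impression AS ti ON t.thread_id = ti.thread_id
    GROUP BY t.thread_id
""").fetchone()

# Pre-aggregated version: the ratings are summed before joining, so tra2
# contributes exactly one row per thread_id.
fixed = conn.execute("""
    SELECT t.thread_id, tra2.liked, tra2.disliked,
           COUNT(DISTINCT tre.thread_report_id),
           COUNT(DISTINCT ti.thread_impression_id)
    FROM thread AS t
    LEFT JOIN (
        SELECT thread_id, SUM(liked) AS liked, SUM(disliked) AS disliked
        FROM thread_rating
        GROUP BY thread_id
    ) AS tra2 ON t.thread_id = tra2.thread_id
    LEFT JOIN thread_report AS tre ON t.thread_id = tre.thread_id
    LEFT JOIN thread_impression AS ti ON t.thread_id = ti.thread_id
    GROUP BY t.thread_id
""").fetchone()

print(naive)  # (1, 6, 6, 12)
print(fixed)  # (1, 1, 1, 2, 3)
```

The naive query reports 6 likes and 6 dislikes instead of 1 and 1, and its COUNT(*) of 12 is the same symptom as the counts of up to 60 you are seeing. The report and impression counts stay correct in both versions only because COUNT(DISTINCT ...) ignores the repeated rows; SUM has no such protection.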