Calculating percentages with GROUP BY query
I have a table with 3 columns that looks like this:
File User Rating (1-5)
------------------------------
00001 1 3
00002 1 4
00003 2 2
00004 3 5
00005 4 3
00005 3 2
00006 2 3
Etc.
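I want to write a query that produces the following output (for each user and rating, the number of files and the percentage of that user's files):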
User Rating Count Percentage
-----------------------------------
1 1 3 .18
1 2 6 .35
1 3 8 .47
2 5 12 .75
2 3 4 .25
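Using PostgreSQL, I know how to write a query that returns the first 3 columns (shown below), but I can't figure out how to calculate the percentage within the GROUP BY: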
SELECT
User,
Rating,
Count(*)
FROM
Results
GROUP BY
User, Rating
ORDER BY
User, Rating
Here, I want the percentage calculation to apply to each user/rating group.
WITH t1 AS
(SELECT User, Rating, Count(*) AS n
FROM your_table
GROUP BY User, Rating)
SELECT User, Rating, n,
(0.0+n)/(SUM(n) OVER (PARTITION BY User)) AS pct -- 0.0+ avoids integer division; SUM(n) is the user's total file count
FROM t1;
Or:
SELECT DISTINCT User, Rating, Count(*) OVER w_user_rating AS n, -- DISTINCT collapses the per-file rows to one row per user and rating
(0.0+Count(*) OVER w_user_rating)/(Count(*) OVER (PARTITION BY User)) AS pct
FROM your_table
WINDOW w_user_rating AS (PARTITION BY User, Rating);
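I would check whether one or the other yields a better query plan, using whatever tooling your RDBMS provides.
Alternatively, you can do it the old-school way, which is arguably easier to understand: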
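To try the grouped-count-plus-window idea without a PostgreSQL server, here is a minimal sketch that runs it through Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the table name results, the column names file_id and user_id (renamed because user is a reserved word in PostgreSQL), and the generated rows, which are shaped to reproduce the counts in the question's sample output. It needs an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical data: (user_id, rating, number_of_files), chosen to match
# the counts in the question's sample output.
groups = [(1, 1, 3), (1, 2, 6), (1, 3, 8), (2, 5, 12), (2, 3, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (file_id TEXT, user_id INTEGER, rating INTEGER)")

# Expand each group into one row per file.
file_no = 0
for user_id, rating, count in groups:
    for _ in range(count):
        file_no += 1
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?)",
            (f"{file_no:05d}", user_id, rating),
        )

query = """
WITH t1 AS (
    SELECT user_id, rating, COUNT(*) AS n
    FROM results
    GROUP BY user_id, rating
)
SELECT user_id, rating, n,
       -- 1.0 * forces floating-point division; SUM(n) is the user's total
       ROUND(1.0 * n / SUM(n) OVER (PARTITION BY user_id), 2) AS pct
FROM t1
ORDER BY user_id, rating
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For user 1 the printed percentages should be the .18, .35 and .47 from the question, and for user 2 the .25 and .75 (listed in rating order here because of the ORDER BY).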
select usr.User as User ,
usr.Rating as Rating ,
usr.N as N ,
(100.0 * usr.N) / total.N as Pct
from ( select User, Rating , count(*) as N
from Results
group by User , Rating
) usr
join ( select User , count(*) as N
from Results
group by User
) total on total.User = usr.User
order by usr.User, usr.Rating
Cheers!
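The join of two grouped subqueries can be checked the same way. The sketch below is only an illustration: it loads the seven sample rows from the question into an in-memory SQLite database via Python's sqlite3 module, with assumed column names file_id and user_id, and then confirms that each user's percentages add up to 100. No window functions are involved, so it should also run on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (file_id TEXT, user_id INTEGER, rating INTEGER)")
# The seven sample rows from the question: (file, user, rating).
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [("00001", 1, 3), ("00002", 1, 4), ("00003", 2, 2), ("00004", 3, 5),
     ("00005", 4, 3), ("00005", 3, 2), ("00006", 2, 3)],
)

join_query = """
SELECT usr.user_id, usr.rating, usr.n, 100.0 * usr.n / total.n AS pct
FROM (SELECT user_id, rating, COUNT(*) AS n
      FROM results
      GROUP BY user_id, rating) AS usr
JOIN (SELECT user_id, COUNT(*) AS n
      FROM results
      GROUP BY user_id) AS total
  ON total.user_id = usr.user_id
ORDER BY usr.user_id, usr.rating
"""
join_rows = conn.execute(join_query).fetchall()

# Sum the percentages per user as a sanity check.
totals = {}
for user_id, rating, n, pct in join_rows:
    totals[user_id] = totals.get(user_id, 0.0) + pct

print(join_rows)
print(totals)
```

Note that this form returns a percentage on a 0 to 100 scale because of the 100.0 factor; drop it (keeping a 1.0 factor so the division stays floating-point) to get fractions like the .18 in the question.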
The best way is to use window functions.
In T-SQL this should work:
SELECT
User,
Rating,
Count(*), SUM(COUNT(*)) OVER (PARTITION BY User) AS Total,
1.0*Count(*)/(SUM(COUNT(*)) OVER (PARTITION BY User)) AS Percentage -- partition by User only; 1.0* avoids integer division
FROM
Results
GROUP BY
User, Rating
ORDER BY
User, Rating
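With a windowed SUM, the PARTITION BY clause decides what the denominator is, so it is worth seeing the effect on a tiny data set. The sketch below is an illustration under assumptions: hypothetical rows for a single user, column names user_id and rating, and SQLite (3.25 or later) driven from Python rather than SQL Server. It groups in a CTE first so that it does not rely on nesting an aggregate inside a window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (user_id INTEGER, rating INTEGER)")
# Hypothetical data: user 1 has three files rated 5 and one file rated 2.
conn.executemany("INSERT INTO results VALUES (?, ?)", [(1, 5), (1, 5), (1, 5), (1, 2)])

query = """
WITH counts AS (
    SELECT user_id, rating, COUNT(*) AS n
    FROM results
    GROUP BY user_id, rating
)
SELECT rating, n,
       -- denominator = all of the user's files
       1.0 * n / SUM(n) OVER (PARTITION BY user_id)         AS share_of_user,
       -- denominator = only this user/rating group, so always 1
       1.0 * n / SUM(n) OVER (PARTITION BY user_id, rating) AS share_of_group,
       -- both operands are integers, so the result is truncated
       n / SUM(n) OVER (PARTITION BY user_id)               AS integer_division
FROM counts
ORDER BY rating
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Partitioning by the user alone gives the 0.25 and 0.75 shares; partitioning by user and rating makes every share 1.0, and leaving out the 1.0 factor truncates the result to 0.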
WITH data AS
(SELECT User, Rating, Count(*) AS Count
FROM Results
GROUP BY User, Rating)
SELECT User, Rating, Count,
(0.0+Count)/(SUM(Count) OVER (PARTITION BY User)) AS Percentage
FROM data;