SQL query for aggregating multiple columns
I would like to write a query in Presto SQL.
The table:
| words | id1 | id2 | id2_like | rank |
|---|---|---|---|---|
| baseball | 28 | 2756 | 1 | 6 |
| baseball | 28 | 3180 | 0 | 5 |
| baseball | 28 | 8161 | 0 | 17 |
| baseball | 11 | 1723 | 0 | 22 |
| baseball | 11 | 5329 | 1 | 29 |
| football | 19 | 3210 | 1 | 2 |
| football | 19 | 5519 | 0 | 18 |
| football | 19 | 6257 | 1 | 3 |
id2_like depends on id2 and can only be 1 or 0.
I would like to get some aggregation results from the above table within one SQL query.
For each value in words, we need to get:
1. the number of id1 where id2_like = 1
2. the percentage of id2_like as 0 out of total id2_like
3. the number of id1 where id2_like = 0
4. the average on id1 of the max rank of id2_like = 0
5. the average percentage of id2_like = 0 on id1 (in case some id2_like = 1 and some are 0)
I know how to write a query for each one, but I am not sure how to get all of them within a single SQL query.
Expected results:
| words | id1_cnt_for_id2_as_1 | perc_id2_as_0 | id1_cnt_for_id2_as_0_perc | max_rank_id2_as_0 | avg_perc_id2_as_0 |
|---|---|---|---|---|---|
| baseball | 2 | 3/5 | 2 | (17+22)/2 | (2/3+1/2)/2 |
| football | 2 | 2/3 | 1 | 18 | 1/3 |
If I understand correctly, here is what you want. However, I didn't understand what you want for number 5.
select words
     , sum(id1_cnt_for_id2_as_1) as id1_cnt_for_id2_as_1
     , sum(sum_perc_id2_as_0) * 100.0 / sum(cnt_perc_id2_as_0) as perc_id2_as_0
     , sum(id1_cnt_for_id2_as_0_perc) as id1_cnt_for_id2_as_0_perc
     , avg(max_rank_id2_as_0) as max_rank_id2_as_0
     , avg(avg_perc_id2_as_0) as avg_perc_id2_as_0
from (
    -- one row per (words, id1)
    select words
         , sum(id2_like) as id1_cnt_for_id2_as_1
         , sum(case when id2_like = 0 then 1 end) as sum_perc_id2_as_0
         , count(*) as cnt_perc_id2_as_0
         , count(distinct case when id2_like = 0 then id1 end) as id1_cnt_for_id2_as_0_perc
         -- highest rank among the id2_like = 0 rows of this id1
         , max(case when id2_like = 0 then rank end) as max_rank_id2_as_0
         -- share of id2_like = 0 rows within this id1, as a percentage
         , sum(case when id2_like = 0 then 1 end) * 100.0 / count(*) as avg_perc_id2_as_0
    from data
    group by words, id1
) t
group by words
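The inner query grouped by words, id1 is the step that makes the per-id1 averages possible: it yields one row per (words, id1), and the outer query then sums or averages those rows per word. As a minimal sketch, that intermediate level can be inspected on the sample rows from the question by inlining them with VALUES. The CTE name sample_rows and the aliases n_rows, n_zero and max_rank_zero are made up for illustration and are not part of the answer above.

```sql
-- Sample rows from the question, inlined so the query runs without a real table.
-- sample_rows is a hypothetical name used only for this sketch.
WITH sample_rows (words, id1, id2, id2_like, rank) AS (
    VALUES
        ('baseball', 28, 2756, 1, 6),
        ('baseball', 28, 3180, 0, 5),
        ('baseball', 28, 8161, 0, 17),
        ('baseball', 11, 1723, 0, 22),
        ('baseball', 11, 5329, 1, 29),
        ('football', 19, 3210, 1, 2),
        ('football', 19, 5519, 0, 18),
        ('football', 19, 6257, 1, 3)
)
-- One row per (words, id1): the level the outer query aggregates over.
SELECT words
     , id1
     , count(*)                                      AS n_rows         -- all rows of this id1
     , count_if(id2_like = 0)                        AS n_zero         -- rows with id2_like = 0
     , max(CASE WHEN id2_like = 0 THEN rank END)     AS max_rank_zero  -- highest rank among those rows
FROM sample_rows
GROUP BY words, id1
ORDER BY words, id1
```

On the sample data this should give (baseball, 11) with 2 rows, 1 zero and max rank 22; (baseball, 28) with 3 rows, 2 zeros and max rank 17; and (football, 19) with 3 rows, 1 zero and max rank 18. These are the terms behind (17+22)/2 and (2/3+1/2)/2 in the expected results.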
Hope this gives you some idea of what to do. Tested in AWS Athena (pretty much like Presto under the hood). I did not understand the fifth question.
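SELECT
    words,
    item_1,
    -- share of rows with id2_like = 0, as a percentage
    (size - item_1) / CAST(size AS decimal(10,4)) * 100 AS item_2,
    item_3,
    max_rank AS item_4
FROM (
    SELECT
        words,
        SUM(id2_like) AS item_1,  -- rows with id2_like = 1
        COUNT(*) AS size,         -- all rows for the word
        -- distinct id1 values with at least one id2_like = 0 row
        COUNT(DISTINCT CASE WHEN id2_like = 0 THEN id1 END) AS item_3,
        -- highest rank among the word's id2_like = 0 rows (not averaged per id1)
        MAX(CASE WHEN id2_like = 0 THEN rank END) AS max_rank
    FROM tb
    GROUP BY 1
)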
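For the fifth item, one reading that matches the expected results is: compute the share of id2_like = 0 rows within each id1, then average those shares per word. The fourth item follows the same pattern with the per-id1 max rank. The sketch below is written under that assumption, using Presto's count_if function and the FILTER clause on aggregates. The CTE per_id1 and its column names are illustrative, and the inlined tb only stands in for the real table.

```sql
-- tb is inlined from the question's sample rows so the sketch is self-contained.
WITH tb (words, id1, id2, id2_like, rank) AS (
    VALUES
        ('baseball', 28, 2756, 1, 6),
        ('baseball', 28, 3180, 0, 5),
        ('baseball', 28, 8161, 0, 17),
        ('baseball', 11, 1723, 0, 22),
        ('baseball', 11, 5329, 1, 29),
        ('football', 19, 3210, 1, 2),
        ('football', 19, 5519, 0, 18),
        ('football', 19, 6257, 1, 3)
),
-- Step 1 (hypothetical helper CTE): one row per (words, id1).
per_id1 AS (
    SELECT words
         , id1
         , count(*)                                  AS n_rows
         , count_if(id2_like = 1)                    AS n_one
         , count_if(id2_like = 0)                    AS n_zero
         , max(rank) FILTER (WHERE id2_like = 0)     AS max_rank_zero
    FROM tb
    GROUP BY words, id1
)
-- Step 2: roll the per-id1 rows up to one row per word.
SELECT words
     , sum(n_one)                                    AS id1_cnt_for_id2_as_1       -- item 1
     , CAST(sum(n_zero) AS double) / sum(n_rows)     AS perc_id2_as_0              -- item 2, as a fraction
     , count_if(n_zero > 0)                          AS id1_cnt_for_id2_as_0_perc  -- item 3
     , avg(max_rank_zero)                            AS max_rank_id2_as_0          -- item 4
     , avg(CAST(n_zero AS double) / n_rows)          AS avg_perc_id2_as_0          -- item 5
FROM per_id1
GROUP BY words
ORDER BY words
```

On the sample rows, baseball should come out as 2, 0.6, 2, 19.5 and roughly 0.583, and football as 2, roughly 0.333, 1, 18 and roughly 0.333. The football value for perc_id2_as_0 is 1/3 under this definition, whereas the expected table lists 2/3, which looks like the share of 1s instead. One design choice to be aware of: an id1 with no id2_like = 0 rows contributes 0 to the item 5 average here, while avg skips it for item 4 because its max_rank_zero is NULL.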