Using rank to calculate various averages
My data looks like this:
+------+--------+------+-------+
| year | month | name | value |
+------+--------+------+-------+
| 2017 | 1 | John | 100 |
| 2017 | 2 | Doe | 200 |
| 2017 | 3 | Jane | 300 |
| . | . | . | . |
| 2018 | 1 | John | 150 |
| 2018 | 2 | Doe | 250 |
| 2018 | 3 | Jane | 350 |
+------+--------+------+-------+
I am trying to calculate the average of the top 2 names for each year and month. I can do this with the query below:
select year, month, avg(sum_value) as avg_of_2
from (
select year,
month,
name,
sum(value) as sum_value,
rank() over (partition by year, month order by sum(value) desc) as rnk
from database.table_a
group by year, month, name
order by 1, 2, 4 desc
) tbl_for_2
where rnk <= 2 -- for top 2 values
group by 1, 2
order by 1, 2;
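But now I want to extend the top-2 average to the top 5, 10, and 50 names as well. Is there a way to do this with rank, without repeating the same query for each cutoff?
My final result would look like this: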
+------+-------+----------+----------+---------+
| year | month | avg_2 | avg_5 | avg_10 |
+------+-------+----------+----------+---------+
| 2017 | 1 | some_val | some_val | some_val |
| 2017 | 2 | some_val | some_val | some_val |
| .. | | | | |
| .. | | | | |
+------+-------+----------+----------+---------+
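A query cannot return a dynamic number of columns, so you have to write out every column explicitly. Just use filtered aggregates in the outer query:
select year, month,
AVG(sum_value) FILTER (WHERE rnk <= 2) as avg_2,
AVG(sum_value) FILTER (WHERE rnk <= 5) as avg_5,
AVG(sum_value) FILTER (WHERE rnk <= 10) as avg_10,
-- ...
AVG(sum_value) FILTER (WHERE rnk <= 100) as avg_100
-- ... and so on, one column per cutoff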
from (
select year,
month,
name,
sum(value) as sum_value,
rank() over (partition by year, month order by sum(value) desc) as rnk
from database.table_a
group by year, month, name
) tbl
group by 1, 2
order by 1, 2;
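The `FILTER` clause on aggregates is available in PostgreSQL 9.4+ and SQLite 3.30+, but not in every engine (MySQL, for example, does not support it). Where it is missing, the same result can usually be obtained with a `CASE` expression inside `AVG`, because `AVG` ignores NULLs. A minimal sketch, assuming the same `database.table_a` table and column names as in the question:

```sql
select year, month,
       -- CASE yields NULL for rows outside the cutoff, and AVG skips NULLs
       avg(case when rnk <= 2  then sum_value end) as avg_2,
       avg(case when rnk <= 5  then sum_value end) as avg_5,
       avg(case when rnk <= 10 then sum_value end) as avg_10
from (
    select year, month, name,
           sum(value) as sum_value,
           rank() over (partition by year, month order by sum(value) desc) as rnk
    from database.table_a
    group by year, month, name
) tbl
group by year, month
order by year, month;
```

Note that `rank()` gives tied values the same rank, so `rnk <= 2` can match more than two names when there is a tie at rank 2. If exactly N rows per group are wanted, `row_number()` could be used in place of `rank()`.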