COUNT(DISTINCT()) OVER (PARTITION BY) in Presto (Athena)?
I have the following table in an Athena (Presto) DB, let's call it table1 (simplified for this question):
  | serverdate | colA | colB | colC | hash
--|------------|------|------|------|------
0 | 2019-12-01 | 1    | OK   | 10   | hash1
1 | 2019-12-02 | 2    | OK   | 10   | hash2
2 | 2019-12-02 | 3    | NOK  | 100  | hash3
3 | 2019-12-01 | 3    | OK   | 0    | hash4
4 | 2019-12-03 | 6    | OK   | 1    | hash5
5 | 2019-12-05 | 8    | NOK  | 0    | hash6
6 | 2019-12-06 | 8    | NOK  | 0    | hash6
The following query is used to count how many distinct "hash" values are in the table:
SELECT 'users' AS Type, round(count(DISTINCT hash)) AS uu
FROM table1
This is later used as a subquery, but that is not important for this question. In this example, the result should be:
  | type  | uu
--|-------|----
0 | users | 6
What I want: I want to do the same counting, but grouped by colA. My result should be something like this:
  | colA | counthash
--|------|-----------
0 | 1    | 1
1 | 2    | 1
2 | 3    | 2
3 | 6    | 1
4 | 8    | 1
I thought of using COUNT(DISTINCT(hash)) OVER (PARTITION BY colA), but as far as I know, COUNT(DISTINCT()) is not allowed as a window function in Presto.
Any ideas on how to do this? Thanks.
You shouldn't need a window function for this. I'm not too familiar with Presto per se, but does the following work?
SELECT colA, round(count(DISTINCT hash)) AS uu
FROM table1
GROUP BY colA;
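If the per-group count really is needed on every original row (which is what the window form in the question would have returned), one possible workaround is to compute the grouped counts once and join them back. The following is only a sketch in Presto-style SQL: the sample rows from the question are inlined through a VALUES list so it does not depend on table1 existing, and the names sample_rows, hash_counts, v, t and c are made up for illustration.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
-- (sample_rows stands in for table1; the name is an assumption.)
WITH sample_rows AS (
    SELECT *
    FROM (
        VALUES
            (DATE '2019-12-01', 1, 'OK',  10,  'hash1'),
            (DATE '2019-12-02', 2, 'OK',  10,  'hash2'),
            (DATE '2019-12-02', 3, 'NOK', 100, 'hash3'),
            (DATE '2019-12-01', 3, 'OK',  0,   'hash4'),
            (DATE '2019-12-03', 6, 'OK',  1,   'hash5'),
            (DATE '2019-12-05', 8, 'NOK', 0,   'hash6'),
            (DATE '2019-12-06', 8, 'NOK', 0,   'hash6')
    ) AS v (serverdate, colA, colB, colC, hash)
),
-- One row per colA with its number of distinct hashes.
-- This is the same GROUP BY aggregation, without round().
hash_counts AS (
    SELECT colA, count(DISTINCT hash) AS counthash
    FROM sample_rows
    GROUP BY colA
)
-- Attach the per-group count to every original row.
SELECT t.serverdate, t.colA, t.hash, c.counthash
FROM sample_rows AS t
JOIN hash_counts AS c
    ON c.colA = t.colA
ORDER BY t.colA, t.serverdate;
```

With the sample data, the two rows where colA = 3 get counthash 2 and every other row gets 1. The hash_counts part on its own reproduces the grouped result asked for in the question. Two caveats: rows with a NULL colA would be dropped by the inner join, and round() is left out because count() already returns an integer.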