
COUNT(DISTINCT()) OVER (PARTITION BY) in Presto (Athena)?

I have the following table in an Athena (Presto) DB, let's call it table1 (simplified for this question):

  | serverdate | colA | colB | colC | hash
  |-------------------------------------------
0 | 2019-12-01 |    1 | OK   | 10   | hash1
1 | 2019-12-02 |    2 | OK   | 10   | hash2
2 | 2019-12-02 |    3 | NOK  | 100  | hash3
3 | 2019-12-01 |    3 | OK   | 0    | hash4
4 | 2019-12-03 |    6 | OK   | 1    | hash5
5 | 2019-12-05 |    8 | NOK  | 0    | hash6
6 | 2019-12-06 |    8 | NOK  | 0    | hash6

The following query counts how many distinct "hash" values are in the table:

SELECT 'users' AS Type, round(count(DISTINCT hash)) AS uu
FROM table1

This is later used as a subquery, but that is not important for this question. In this example, the result should be:

  |  type | uu
  |-------------
0 | users | 6

What I want: the same distinct count, but grouped by colA. My result should look something like this:

  | colA | counthash
  |------------------
0 |    1 | 1
1 |    2 | 1
2 |    3 | 2
3 |    6 | 1
4 |    8 | 1

I was thinking of using COUNT(DISTINCT hash) OVER (PARTITION BY colA), but as far as I know, COUNT(DISTINCT ...) is not allowed as a window function in Presto.

Any ideas on how to do this? Thanks.

You shouldn't need a window function for this: you want one row per colA, which is what a plain GROUP BY gives you. I'm not too familiar with Presto per se, but does the following work?

SELECT colA, round(count(DISTINCT hash)) AS uu
FROM table1
GROUP BY colA;
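
To sanity-check this against the sample rows without an Athena account, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in and not Presto, so treat it as a check of the query logic rather than of Athena's behaviour. The names ROWS, GROUPED_SQL, PER_ROW_SQL, build_connection and run are made up for this illustration, and the counthash alias is taken from the desired output in the question. The second query is an assumption about a possible follow-up: if every original row has to be kept (which is what a window function would have given), joining the grouped counts back on colA is one portable way to get there.

```python
import sqlite3

# Sample rows from the question: (serverdate, colA, colB, colC, hash).
ROWS = [
    ("2019-12-01", 1, "OK", 10, "hash1"),
    ("2019-12-02", 2, "OK", 10, "hash2"),
    ("2019-12-02", 3, "NOK", 100, "hash3"),
    ("2019-12-01", 3, "OK", 0, "hash4"),
    ("2019-12-03", 6, "OK", 1, "hash5"),
    ("2019-12-05", 8, "NOK", 0, "hash6"),
    ("2019-12-06", 8, "NOK", 0, "hash6"),
]

# One row per colA: plain aggregation, no window function involved.
GROUPED_SQL = """
SELECT colA, COUNT(DISTINCT hash) AS counthash
FROM table1
GROUP BY colA
ORDER BY colA
"""

# Keeps every original row and attaches the per-colA distinct count
# by joining the aggregate back, instead of COUNT(DISTINCT ...) OVER (...).
PER_ROW_SQL = """
SELECT t.serverdate, t.colA, t.hash, c.counthash
FROM table1 AS t
JOIN (
    SELECT colA, COUNT(DISTINCT hash) AS counthash
    FROM table1
    GROUP BY colA
) AS c ON c.colA = t.colA
ORDER BY t.colA, t.serverdate
"""


def build_connection():
    """Create an in-memory SQLite database holding the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 "
        "(serverdate TEXT, colA INTEGER, colB TEXT, colC INTEGER, hash TEXT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def run(sql):
    """Run one query against a fresh copy of the sample table."""
    conn = build_connection()
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run(GROUPED_SQL):
        print(row)
```

The SQL inside GROUPED_SQL is the same query as above minus the round() call, which has nothing to round because a count is already an integer.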
