How to get the maximum row_number() for each group in Hive SQL

Question

Using row_number() in Hive SQL, I can filter out duplicates and pick the first instance of each id by selecting seq = 1 in the WHERE clause, as below. What I need is a way to find the last instance in each group.

select * from 
(select c1,c2,c3,c4,c5,id, row_number() over(partition by id ORDER BY id) as seq
from 
table) as cnt where seq = 1;

For example, if id 1212 has 3 instances and id 1313 has 5 instances in the table, as shown below, the query above returns only the row with seq = 1 for each id. Instead, I want the row with seq = 3 for id 1212 and the row with seq = 5 for id 1313.

 c1   c2   c3   c4   c5   ID     seq
2020 2020 2020 2020 2020 1212     1
2021 2020 2021 2020 2021 1212     2
2022 2020 2022 2020 2022 1212     3
2023 2020 2023 2020 2023 1313     1
2024 2020 2024 2020 2024 1313     2
2025 2020 2025 2020 2025 1313     3
2026 2020 2026 2020 2026 1313     4
2026 2020 2026 2020 2026 1313     5

Answer 1

Add an extra column, COUNT(*) OVER (PARTITION BY id) AS cnt. It contains the number of rows in the group, which is also the maximum ROW_NUMBER value for that group. Filtering on seq = cnt in the outer query then keeps the last row of each group.
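
As a rough sketch of how this could look when combined with the query from the question: the name my_table is a hypothetical stand-in for the real table, and the subquery alias t is arbitrary.

```sql
-- Sketch only: my_table is a placeholder for the actual table name.
select c1, c2, c3, c4, c5, id, seq
from (
    select c1, c2, c3, c4, c5, id,
           -- position of the row within its id group
           row_number() over (partition by id order by id) as seq,
           -- total number of rows in the id group, i.e. the largest seq
           count(*) over (partition by id) as cnt
    from my_table
) t
where seq = cnt;
```

Because the partition is ordered by id itself, which row receives the highest seq is not guaranteed; the query returns exactly one row per id, but not necessarily the same one on every run.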

Answer 2

select id, max(seq) as max_seq
from 
(select *, row_number() over(partition by id ORDER BY id) as seq
from 
table) maxseq
group by id;

Answer 3

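Put all of those columns in the GROUP BY and take max() of the row number. Note that this returns one row for each distinct combination of c1, c2, c3, c4, c5 and id, not one row per id.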

select c1,c2,c3,c4,c5,id,max(r_no) 
from 
(
    select c1,c2,c3,c4,c5,id, row_number() over (partition by id ORDER BY c1,c2,c3,c4,c5,id) as r_no
    from 
    table
) a
group by c1,c2,c3,c4,c5,id

Answer 4

Change the ascending sort to a descending sort:

select t.* 
from (select c1, c2, c3, c4, c5, id,
             row_number() over (partition by id ORDER BY id desc) as seqnum
------------------------------------------------------------^
      from table
    ) t
where seqnum = 1;
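
One caveat: every row in a partition has the same id, so ordering by id (ascending or descending) does not define an order inside the group, and the row that gets seqnum = 1 is arbitrary. A minimal sketch of a deterministic variant follows, assuming c1 is the column that defines "first" and "last" within each id (the question does not say which column does) and using my_table as a hypothetical table name.

```sql
-- Assumption: c1 defines the order within each id; swap in the real ordering column.
-- my_table is a placeholder for the actual table name.
select c1, c2, c3, c4, c5, id
from (
    select c1, c2, c3, c4, c5, id,
           -- descending sort puts the last row of each id group at seqnum = 1
           row_number() over (partition by id order by c1 desc) as seqnum
    from my_table
) t
where seqnum = 1;
```

With the sample data from the question, this would pick the row with c1 = 2022 for id 1212 and a row with c1 = 2026 for id 1313.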

4 answers

solution1
2 2018-06-28 14:55:20

solution2
1 ACCEPTED 2018-06-28 15:05:41

solution3
1 2018-06-28 15:28:59

solution4
1 2018-06-28 15:41:53
