
Athena: get the minimum value in each group and the corresponding values of the other columns

Input Table

user id action  date           collection

aaa  1   view   2020-09-01     {some JSON data_1}
aaa  1   view   2020-09-02     {some JSON data_2}
aaa  1   view   2020-09-03     {some JSON data_3}
bbb  2   view   2020-09-08     {some JSON data_22}
bbb  2   view   2020-09-09     {some JSON data_23}
ccc  2   view   2020-09-01     {some JSON data_99}
ddd  3   view   2020-09-01     {some JSON data_88}

Output Table

user id action  date           collection

aaa  1   view   2020-09-01     {some JSON data_1}
bbb  2   view   2020-09-08     {some JSON data_22}
ccc  2   view   2020-09-01     {some JSON data_99}
ddd  3   view   2020-09-01     {some JSON data_88}

Comparing the input table with the output table shows what I am after: group by (user, id, action), take min(date) for each group, and return the collection value from that same row.

Can anyone suggest an approach?

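One option is to filter with a correlated subquery that finds the earliest date in each (user, id, action) group:

select t.*
from mytable t
where t.date = (
    -- earliest date within the current row's (user, id, action) group
    select min(t1.date) from mytable t1
    where t1.user = t.user
      and t1.id = t.id
      and t1.action = t.action
)

If several rows in a group share the minimum date, this returns all of them.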

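Another solution is to use a window function to rank the records within each (user, id, action) group by date, then keep only the first-ranked row of each group:

select *
from (
    select t.*,
           -- rn = 1 marks the earliest row in each group
           row_number() over(partition by user, id, action order by date) rn
    from mytable t
) t
where rn = 1

Note that select * also returns the helper column rn, and that ties on the minimum date yield exactly one (arbitrary) row per group.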
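To sanity-check both queries without an Athena account, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs each query. This rests on assumptions: SQLite stands in for Athena (their SQL dialects differ, and window functions need SQLite 3.25 or later), the table name mytable comes from the answer above, the dates are stored as ISO text so they sort correctly, and the run helper is hypothetical. The identifiers user, action and date are double-quoted only as a precaution.

```python
import sqlite3

# Sample rows copied from the question's input table.
ROWS = [
    ("aaa", 1, "view", "2020-09-01", "{some JSON data_1}"),
    ("aaa", 1, "view", "2020-09-02", "{some JSON data_2}"),
    ("aaa", 1, "view", "2020-09-03", "{some JSON data_3}"),
    ("bbb", 2, "view", "2020-09-08", "{some JSON data_22}"),
    ("bbb", 2, "view", "2020-09-09", "{some JSON data_23}"),
    ("ccc", 2, "view", "2020-09-01", "{some JSON data_99}"),
    ("ddd", 3, "view", "2020-09-01", "{some JSON data_88}"),
]

# Correlated subquery: keep rows whose date equals the group's minimum.
SUBQUERY_SQL = """
select t.*
from mytable t
where t."date" = (
    select min(t1."date")
    from mytable t1
    where t1."user" = t."user"
      and t1.id = t.id
      and t1."action" = t."action"
)
order by t."user"
"""

# Window function: number rows per group by date and keep the first.
WINDOW_SQL = """
select "user", id, "action", "date", collection
from (
    select t.*,
           row_number() over (
               partition by "user", id, "action" order by "date"
           ) as rn
    from mytable t
) ranked
where rn = 1
order by "user"
"""


def run(sql):
    """Hypothetical helper: build the sample table, run sql, return rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'create table mytable ("user" text, id integer, '
            '"action" text, "date" text, collection text)'
        )
        conn.executemany("insert into mytable values (?, ?, ?, ?, ?)", ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


subquery_rows = run(SUBQUERY_SQL)
window_rows = run(WINDOW_SQL)

for row in window_rows:
    print(row)
```

On this sample data both queries should return the same four rows as the output table in the question, because no group has two rows tied on the minimum date.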
