
SQL AWS Athena Group by Without a Column

I have this dataset:

patient_id   doctor_id   status   created_at
1            1           A        2020-10-01 10:00:00
1            1           P        2020-10-01 10:30:00
1            1           U        2020-10-01 10:35:00
1            2           A        2020-10-01 10:40:00
...

I want to group it by patient_id and doctor_id, but without grouping by status, so that the result looks like this:

patient_id   doctor_id   status   created_at
1            1           U        2020-10-01 10:35:00
1            2           A        2020-10-01 10:40:00
...

AWS Athena requires every selected column that is not aggregated to appear in the GROUP BY, but I only need the last status.

In Athena/Presto you can do this with the max_by function:

SELECT
  patient_id,
  doctor_id,
  MAX_BY(status, created_at) AS last_status
FROM the_table
GROUP BY 1, 2

The max_by(x, y) function returns the value of column x from the row that has the maximum value of column y within each group.
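The query above returns only the last status, while the desired output in the question also shows created_at. A minimal sketch of one way to get both, assuming the column types below (integer ids, a varchar status, a timestamp created_at), is to pair MAX_BY with a plain MAX on the same column. The inline VALUES rows are copied from the question's sample so the query can be run without a real table; the_table here is only a stand-in for it.

```sql
-- Stand-in for the real table: the four sample rows from the question.
-- Column types are an assumption; adjust them to the actual schema.
WITH the_table AS (
  SELECT *
  FROM (
    VALUES
      (1, 1, 'A', TIMESTAMP '2020-10-01 10:00:00'),
      (1, 1, 'P', TIMESTAMP '2020-10-01 10:30:00'),
      (1, 1, 'U', TIMESTAMP '2020-10-01 10:35:00'),
      (1, 2, 'A', TIMESTAMP '2020-10-01 10:40:00')
  ) AS t (patient_id, doctor_id, status, created_at)
)
SELECT
  patient_id,
  doctor_id,
  MAX_BY(status, created_at) AS status,     -- status of the latest row in the group
  MAX(created_at)            AS created_at  -- timestamp of that same latest row
FROM the_table
GROUP BY patient_id, doctor_id
ORDER BY patient_id, doctor_id
```

For the sample rows this should give one row per (patient_id, doctor_id) pair: status U at 10:35:00 for (1, 1) and status A at 10:40:00 for (1, 2). Spelling out the column names in GROUP BY instead of using positions is a readability choice; GROUP BY 1, 2 behaves the same here.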

ROW_NUMBER provides one option here:

WITH cte AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY patient_id, doctor_id ORDER BY created_at DESC) rn
    FROM yourTable
)

SELECT patient_id, doctor_id, status, created_at
FROM cte
WHERE rn = 1
ORDER BY patient_id, doctor_id;

