
Athena SQL - Self-join subquery using outer column

I have data cataloged in an S3 bucket that looks very similar to this:

+-----+-----------+---------+-----------------------+
| id  | title     | event   | time                  |
+-----+-----------+---------+-----------------------+
|1    | book A    | BORROW  | 2018-07-01 09:00:00   |
|1    | book A    | RETURN  | 2018-08-01 09:00:00   |
|2    | book B    | BORROW  | 2018-08-01 13:00:00   |
|2    | book B    | RETURN  | 2018-10-01 17:00:00   |
|1    | book A    | BORROW  | 2018-11-01 09:00:00   |
|1    | book A    | RETURN  | 2018-12-01 09:00:00   |
+-----+-----------+---------+-----------------------+

I basically want to write a SELECT statement in Amazon Athena that shows each borrow time and its matching return time on the same row, like this:

+-----+-----------+-----------------------+-----------------------+
| id  | title     | borrow_time           | return_time           |
+-----+-----------+-----------------------+-----------------------+
|1    | book A    | 2018-07-01 09:00:00   | 2018-08-01 09:00:00   |
|2    | book B    | 2018-08-01 13:00:00   | 2018-10-01 17:00:00   |
|1    | book A    | 2018-11-01 09:00:00   | 2018-12-01 09:00:00   |
+-----+-----------+-----------------------+-----------------------+

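I have spent a lot of time writing about 5 different queries (using things like OUTER APPLY), but Athena seems to be very picky about syntax, especially since it has no OUTER APPLY functionality. Here is the logic of my latest statement: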

SELECT b.id,
       b.title,
       b.time AS borrow_time,
       MIN(r.time) AS return_time
FROM (
      SELECT id,
             title,
             time
      FROM books
      WHERE event = 'BORROW'
     ) b
OUTER JOIN (
            SELECT id,
                   time
            FROM books
            WHERE event = 'RETURN'
           ) r
        ON b.id = r.id
       AND b.time < r.time
GROUP BY b.id,
         b.title,
         borrow_time
ORDER BY borrow_time;

Any ideas on how to solve this would be greatly appreciated!
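For reference, the self-join in the question looks close to working. The following is a minimal sketch, assuming Athena's Presto/Trino dialect and assuming a book is always returned before it is borrowed again. It makes two changes: the join is spelled LEFT JOIN (a bare OUTER JOIN is not accepted), and GROUP BY uses the underlying expression b.time, since, as far as I know, select-list aliases such as borrow_time cannot be referenced in GROUP BY, although ORDER BY accepts them.

```sql
-- Sketch only: pairs each BORROW with the earliest later RETURN for the same id.
-- Assumes the table is named books, as in the question.
SELECT b.id,
       b.title,
       b.time      AS borrow_time,
       MIN(r.time) AS return_time   -- NULL if the book has not been returned yet
FROM (
      SELECT id, title, time
      FROM books
      WHERE event = 'BORROW'
     ) b
LEFT JOIN (
      SELECT id, time
      FROM books
      WHERE event = 'RETURN'
     ) r
       ON b.id = r.id
      AND b.time < r.time
GROUP BY b.id,
         b.title,
         b.time                     -- the expression itself, not the alias
ORDER BY borrow_time;
```

Because the join condition is an inequality, every borrow is matched against all later returns of the same book before MIN picks one, so this can get expensive on large tables.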

Assuming the borrows and returns are perfectly paired, you can enumerate them and then use conditional aggregation:

select id, title,
       max(case when event = 'BORROW' then b.time end) as borrow_time,
       max(case when event = 'RETURN' then b.time end) as return_time
from (select b.*,
             row_number() over (partition by b.id, b.event order by b.time) as seqnum
      from books b
     ) b
group by id, title, seqnum
order by id, title, seqnum;
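To try this idea without touching the cataloged table, the sample rows can be inlined. This is a sketch under assumptions: the CTE names books and numbered are illustrative stand-ins, the time column is assumed to be a timestamp, and the final ordering by borrow_time is my choice to match the layout requested in the question.

```sql
-- Hypothetical inline copy of the question's sample rows.
WITH books AS (
    SELECT *
    FROM (
        VALUES
            (1, 'book A', 'BORROW', TIMESTAMP '2018-07-01 09:00:00'),
            (1, 'book A', 'RETURN', TIMESTAMP '2018-08-01 09:00:00'),
            (2, 'book B', 'BORROW', TIMESTAMP '2018-08-01 13:00:00'),
            (2, 'book B', 'RETURN', TIMESTAMP '2018-10-01 17:00:00'),
            (1, 'book A', 'BORROW', TIMESTAMP '2018-11-01 09:00:00'),
            (1, 'book A', 'RETURN', TIMESTAMP '2018-12-01 09:00:00')
    ) AS t (id, title, event, time)
),
-- Number the borrows and the returns of each book separately, in time order,
-- so the n-th BORROW and the n-th RETURN of a book share the same seqnum.
numbered AS (
    SELECT id, title, event, time,
           row_number() OVER (PARTITION BY id, event ORDER BY time) AS seqnum
    FROM books
)
-- Collapse each (id, title, seqnum) group into a single row.
SELECT id,
       title,
       max(CASE WHEN event = 'BORROW' THEN time END) AS borrow_time,
       max(CASE WHEN event = 'RETURN' THEN time END) AS return_time
FROM numbered
GROUP BY id, title, seqnum
ORDER BY borrow_time;
```

For book A, the July borrow and the August return both get seqnum 1, and the November borrow and December return both get seqnum 2, which is what lets the GROUP BY fold each pair into one row. If a return were missing somewhere in the middle of the history, later pairs would be misaligned, which is why the answer states the pairing assumption up front.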

Try CASE WHEN together with the row_number() function:

-- Note: this returns one row per event, with NULL in whichever
-- of the two time columns does not apply to that row.
WITH pcte AS
(
    SELECT id,
           title,
           event,
           time,
           row_number() OVER (ORDER BY id, title, event) AS rn
    FROM books
)
SELECT id,
       title,
       CASE WHEN event = 'BORROW' THEN time END AS borrow_time,
       CASE WHEN event = 'RETURN' THEN time END AS return_time
FROM pcte
ORDER BY id, title, rn
