
SQL - How to select x number of rows prior to a specific row

I have this table:

   ts  |  user_id  |   event   |  
-------------------------------
 1500        a         eat 
 1501        a         walk 
 1502        a         sleep 
 1500        b         eat 
 1501        b         sleep 
 1502        b         wake
 1500        c         walk 
 1501        c         eat
 1502        c         sit
 1503        c         sleep 
 1504        c         wake 

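I want to select x number of rows prior to a specific event. For example, say I want to select the 2 events prior to sleep for each user_id.

My final result table should look like this: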

user_id  |   event   |   rank  |
--------------------------------
    a         eat         1
    a         walk        2
    a         sleep       3
    b         NULL        0
    b         eat         1
    b         sleep       2
    c         eat         2
    c         sit         3
    c         sleep       4

How can I do this in SQL (specifically Redshift SQL)?

Hmmm . . . You can use lead():

select t.*
from (select t.*,
             lead(event) over (partition by user_id order by ts) as next_event,
             lead(event, 2) over (partition by user_id order by ts) as next_event2
      from t
     ) t
where 'sleep' in (event, next_event, next_event2);

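Note: This only returns rows that exist in the data. If you need to generate rows (such as the NULL row for user b), you need additional logic.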
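To try the lead() query without a Redshift cluster, here is a minimal sketch that runs it against the sample rows from the question. It assumes Python's built-in sqlite3 module with SQLite 3.25 or newer (needed for window functions); the helper name run_lead_query is made up for this illustration, and SQLite stands in for Redshift only for this standard syntax.

```python
import sqlite3

# Sample rows copied from the question; the table name `t` follows the query above.
ROWS = [
    (1500, "a", "eat"), (1501, "a", "walk"), (1502, "a", "sleep"),
    (1500, "b", "eat"), (1501, "b", "sleep"), (1502, "b", "wake"),
    (1500, "c", "walk"), (1501, "c", "eat"), (1502, "c", "sit"),
    (1503, "c", "sleep"), (1504, "c", "wake"),
]

# Same logic as the answer's query, trimmed to two output columns and ordered
# so the result is deterministic.
QUERY = """
select user_id, event
from (select t.*,
             lead(event) over (partition by user_id order by ts) as next_event,
             lead(event, 2) over (partition by user_id order by ts) as next_event2
      from t
     ) t
where 'sleep' in (event, next_event, next_event2)
order by user_id, ts
"""


def run_lead_query():
    """Hypothetical helper: load the sample data into SQLite and run the query."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (ts integer, user_id text, event text)")
    con.executemany("insert into t values (?, ?, ?)", ROWS)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in run_lead_query():
        print(row)
```

For user b only two rows come back (eat, sleep), which is the behaviour the note describes: the query filters existing rows and does not invent the missing one.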

EDIT:

You can actually generalize this:

select t.*
from (select t.*,
             sum(case when event = 'sleep' then 1 else 0 end) over (partition by user_id order by ts rows between current row and 2 following) as cnt_sleep
      from t
     ) t
where cnt_sleep > 0;

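This counts the number of "sleep" events in a frame made up of the current row and the rows that follow it (two here). If "sleep" is found anywhere in that frame, the row is returned. To keep a different number of rows before each sleep, change the 2 in the frame.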
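As a rough illustration of why this form generalizes, the sketch below makes the frame size a parameter. It again assumes Python's sqlite3 with SQLite 3.25 or newer as a stand-in for Redshift; the function name rows_before_sleep and its parameter n are hypothetical and not part of the answer.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1500, "a", "eat"), (1501, "a", "walk"), (1502, "a", "sleep"),
    (1500, "b", "eat"), (1501, "b", "sleep"), (1502, "b", "wake"),
    (1500, "c", "walk"), (1501, "c", "eat"), (1502, "c", "sit"),
    (1503, "c", "sleep"), (1504, "c", "wake"),
]


def rows_before_sleep(n):
    """Return each 'sleep' row plus up to n rows before it, per user."""
    # The frame size is interpolated as a validated integer rather than bound
    # as a parameter, so the frame clause stays a plain literal.
    query = f"""
    select user_id, event
    from (select t.*,
                 sum(case when event = 'sleep' then 1 else 0 end)
                     over (partition by user_id order by ts
                           rows between current row and {int(n)} following) as cnt_sleep
          from t
         ) t
    where cnt_sleep > 0
    order by user_id, ts
    """
    con = sqlite3.connect(":memory:")
    con.execute("create table t (ts integer, user_id text, event text)")
    con.executemany("insert into t values (?, ?, ?)", ROWS)
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    print(rows_before_sleep(1))
    print(rows_before_sleep(2))
```

With n = 2 this returns the same rows as the lead() version; with n = 1 it keeps only the single event before each sleep.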

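This is a gaps-and-islands problem, where you want the first row and the last two rows of each island.

Probably the safest approach is to use a window sum of the sleep events to define the groups, then filter with row_number():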

select *
from (
    select t.*, 
        row_number() over(partition by user_id, grp order by ts) rn_asc,
        row_number() over(partition by user_id, grp order by ts desc) rn_desc
    from (
        select t.*,
            sum(case when event = 'sleep' then 1 else 0 end) 
                over(partition by user_id order by ts desc)  grp
        from mytable t
    ) t
) t
where (rn_asc = 1 or rn_desc <= 2) and grp > 0
order by user_id, ts

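We define the islands with a window sum of the "sleep" events, ordered by ts descending. Then we just enumerate the rows of each island in ascending and descending order, and filter on the records we are interested in.

Demo on DB Fiddle: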
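To see how the descending running sum forms the islands, this small sketch computes only the grp column for a single user. It assumes Python's sqlite3 with SQLite 3.25 or newer in place of Redshift; the function name island_groups_for_user is invented for the example, and the table name mytable follows the query above.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1500, "a", "eat"), (1501, "a", "walk"), (1502, "a", "sleep"),
    (1500, "b", "eat"), (1501, "b", "sleep"), (1502, "b", "wake"),
    (1500, "c", "walk"), (1501, "c", "eat"), (1502, "c", "sit"),
    (1503, "c", "sleep"), (1504, "c", "wake"),
]

# Inner-most step of the answer's query: a running count of 'sleep' events,
# walking each user's rows from the latest timestamp back to the earliest.
QUERY = """
select ts, event,
       sum(case when event = 'sleep' then 1 else 0 end)
           over (partition by user_id order by ts desc) as grp
from mytable
where user_id = ?
order by ts
"""


def island_groups_for_user(user_id):
    """Hypothetical helper: return (ts, event, grp) for one user, oldest first."""
    con = sqlite3.connect(":memory:")
    con.execute("create table mytable (ts integer, user_id text, event text)")
    con.executemany("insert into mytable values (?, ?, ?)", ROWS)
    result = con.execute(QUERY, (user_id,)).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in island_groups_for_user("c"):
        print(row)
```

For user c, the row after the sleep (wake at 1504) gets grp = 0, while the sleep row and everything before it get grp = 1. That is why the outer query keeps only grp > 0: rows with grp = 0 come after the user's last sleep.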

  ts | user_id | event | grp | rn_asc | rn_desc
---: | :------ | :---- | --: | -----: | ------:
1500 | a       | eat   |   1 |      1 |       3
1501 | a       | walk  |   1 |      2 |       2
1502 | a       | sleep |   1 |      3 |       1
1500 | b       | eat   |   1 |      1 |       2
1501 | b       | sleep |   1 |      2 |       1
1500 | c       | walk  |   1 |      1 |       4
1502 | c       | sit   |   1 |      3 |       2
1503 | c       | sleep |   1 |      4 |       1

EDIT

Redshift requires a window frame when a window function has an order by clause, so the query is a bit longer to type:

select *
from (
    select t.*, 
        row_number() over(
            partition by user_id, grp 
            order by ts rows between unbounded preceding and current row
        ) rn_asc,
        row_number() over(
            partition by user_id, grp 
            order by ts desc rows between unbounded preceding and current row
        ) rn_desc
    from (
        select t.*,
            sum(case when event = 'sleep' then 1 else 0 end) over(
                partition by user_id 
                order by ts desc
                rows between unbounded preceding and current row
            )  grp
        from mytable t
    ) t
) t
where (rn_asc = 1 or rn_desc <= 2) and grp > 0
order by user_id, ts
