SQL - How to select x number of rows prior to a specific row
I have this table:
ts | user_id | event |
-------------------------------
1500 a eat
1501 a walk
1502 a sleep
1500 b eat
1501 b sleep
1502 b wake
1500 c walk
1501 c eat
1502 c sit
1503 c sleep
1504 c wake
So I want to select x number of rows before a specific event. Say I want to select the 2 events before sleep for each user_id.
My final table should look like this:
user_id | event | rank |
--------------------------------
a eat 1
a walk 2
a sleep 3
b NULL 0
b eat 1
b sleep 2
c eat 2
c sit 3
c sleep 4
How can I do this in SQL (specifically Redshift SQL)?
Hmm . . . You can use lead():
select t.*
from (select t.*,
lead(event) over (partition by user_id order by ts) as next_event,
lead(event, 2) over (partition by user_id order by ts) as next_event2
from t
) t
where 'sleep' in (event, next_event, next_event2);
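Note: This only returns rows that exist in the data. If you need to generate rows (such as the NULL row for user b in the expected output), you need additional logic.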
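To check what the lead() query returns on the sample data, here is a minimal sketch that runs it through Python's sqlite3 module. SQLite is only a stand-in for Redshift here (an assumption; window functions need SQLite 3.25 or later), and the helper name `run_lead_query` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (ts, user_id, event)
ROWS = [
    (1500, "a", "eat"), (1501, "a", "walk"), (1502, "a", "sleep"),
    (1500, "b", "eat"), (1501, "b", "sleep"), (1502, "b", "wake"),
    (1500, "c", "walk"), (1501, "c", "eat"), (1502, "c", "sit"),
    (1503, "c", "sleep"), (1504, "c", "wake"),
]

# Same logic as the lead() answer, with an explicit column list and ordering
LEAD_QUERY = """
select user_id, ts, event
from (select t.*,
             lead(event) over (partition by user_id order by ts) as next_event,
             lead(event, 2) over (partition by user_id order by ts) as next_event2
      from t
     ) t
where 'sleep' in (event, next_event, next_event2)
order by user_id, ts
"""


def run_lead_query():
    """Load the sample data into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (ts integer, user_id text, event text)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    result = conn.execute(LEAD_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_lead_query():
        print(row)
```

For user b only two rows come back, because there is just one event before sleep; the query does not invent the missing row.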
EDIT:
You can actually generalize this:
select t.*
from (select t.*,
sum(case when event = 'sleep' then 1 else 0 end) over (partition by user_id order by ts rows between current row and 2 following) as cnt_sleep
from t
) t
where cnt_sleep > 0;
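This counts the number of "sleep" events in a window of n rows: the current row plus the n - 1 rows that follow it (here n = 3). If "sleep" is found in any of them, the row is returned.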
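The generalized form can be sketched with the number of preceding events as a parameter. This again uses Python's sqlite3 module as a stand-in for Redshift (an assumption); the function name `rows_up_to_event` and its arguments are hypothetical. The frame offset is inlined into the SQL text after being forced to an integer, while the event name is passed as a bound parameter.

```python
import sqlite3

# Sample rows from the question: (ts, user_id, event)
ROWS = [
    (1500, "a", "eat"), (1501, "a", "walk"), (1502, "a", "sleep"),
    (1500, "b", "eat"), (1501, "b", "sleep"), (1502, "b", "wake"),
    (1500, "c", "walk"), (1501, "c", "eat"), (1502, "c", "sit"),
    (1503, "c", "sleep"), (1504, "c", "wake"),
]


def rows_up_to_event(event, n):
    """Return every `event` row plus up to n rows before it, per user."""
    n = int(n)  # inlined into the SQL below, so make sure it is an integer
    if n < 0:
        raise ValueError("n must be non-negative")
    query = f"""
    select user_id, ts, event
    from (select t.*,
                 sum(case when event = ? then 1 else 0 end)
                     over (partition by user_id order by ts
                           rows between current row and {n} following) as cnt
          from t
         ) t
    where cnt > 0
    order by user_id, ts
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (ts integer, user_id text, event text)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    result = conn.execute(query, (event,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_up_to_event("sleep", 2):
        print(row)
```

With n = 2 this returns the same rows as the lead() version; changing n only changes the frame width, not the shape of the query.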
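This is a gaps-and-islands problem, where you want the first row and the last two rows of each island.
Probably the safest approach is to use a window sum of the sleep events to define the groups, and then filter with row_number():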
select *
from (
select t.*,
row_number() over(partition by user_id, grp order by ts) rn_asc,
row_number() over(partition by user_id, grp order by ts desc) rn_desc
from (
select t.*,
sum(case when event = 'sleep' then 1 else 0 end)
over(partition by user_id order by ts desc) grp
from mytable t
) t
) t
where (rn_asc = 1 or rn_desc <= 2) and grp > 0
order by user_id, ts
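We define the islands with a window sum of the "sleep" events, taken in descending order of ts. Then we just enumerate the rows of each island in ascending and descending order, and filter on the records we are interested in.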
  ts | user_id | event | grp | rn_asc | rn_desc
---: | :------ | :---- | --: | -----: | ------:
1500 | a       | eat   |   1 |      1 |       3
1501 | a       | walk  |   1 |      2 |       2
1502 | a       | sleep |   1 |      3 |       1
1500 | b       | eat   |   1 |      1 |       2
1501 | b       | sleep |   1 |      2 |       1
1500 | c       | walk  |   1 |      1 |       4
1502 | c       | sit   |   1 |      3 |       2
1503 | c       | sleep |   1 |      4 |       1
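To see how the islands are formed before any filtering, the inner query can be run on its own. This is a minimal sketch using Python's sqlite3 module in place of Redshift (an assumption, so no explicit frame clause is needed here); the helper name `island_groups` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (ts, user_id, event)
ROWS = [
    (1500, "a", "eat"), (1501, "a", "walk"), (1502, "a", "sleep"),
    (1500, "b", "eat"), (1501, "b", "sleep"), (1502, "b", "wake"),
    (1500, "c", "walk"), (1501, "c", "eat"), (1502, "c", "sit"),
    (1503, "c", "sleep"), (1504, "c", "wake"),
]

# Running count of "sleep" events, walking each user's rows from latest to earliest
GROUP_QUERY = """
select user_id, ts, event,
       sum(case when event = 'sleep' then 1 else 0 end)
           over (partition by user_id order by ts desc) as grp
from mytable
order by user_id, ts
"""


def island_groups():
    """Return (user_id, ts, event, grp) for every row of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (ts integer, user_id text, event text)")
    conn.executemany("insert into mytable values (?, ?, ?)", ROWS)
    result = conn.execute(GROUP_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in island_groups():
        print(row)
```

Rows that come after a user's last sleep (the two wake events) get grp = 0, which is why the outer query's `grp > 0` condition drops them.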
EDIT
Redshift requires a window frame when a window function has an order by clause. So this is a bit longer to type:
select *
from (
select t.*,
row_number() over(
partition by user_id, grp
order by ts rows between unbounded preceding and current row
) rn_asc,
row_number() over(
partition by user_id, grp
order by ts desc rows between unbounded preceding and current row
) rn_desc
from (
select t.*,
sum(case when event = 'sleep' then 1 else 0 end) over(
partition by user_id
order by ts desc rows between unbounded preceding and current row
) grp
from mytable t
) t
) t
where (rn_asc = 1 or rn_desc <= 2) and grp > 0
order by user_id, ts