Row number that restarts at date gaps in Hive

Question

I have a table like this:

col1    col2
1      2020-01-15
1      2020-01-16
1      2020-01-17
1      2020-01-18
1      2020-01-20
2      2020-01-09
2      2020-01-10
2      2020-01-15

I am computing a rank like this:

select
    col1,
    col2,
    date_sub(col2, -row_number() over (partition by col1 order by col2)) as rnk
  from myTable

and I get this ranking:

col1    col2        rnk
1      2020-01-15   1
1      2020-01-16   2
1      2020-01-17   3
1      2020-01-18   4
1      2020-01-20   5
2      2020-01-09   1
2      2020-01-10   2
2      2020-01-15   3

But I need a ranking like this:

col1    col2        rnk
1      2020-01-15   1
1      2020-01-16   2
1      2020-01-17   3
1      2020-01-18   4
1      2020-01-20   1
2      2020-01-09   1
2      2020-01-10   2
2      2020-01-15   1

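The rank should restart at 1 whenever the dates within a col1 value stop being consecutive (for user 1, for example, between the 18th and the 20th). I don't know how to achieve this.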

Answer 1

You can use:

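select t.*,
       -- col2 minus seqnum is constant within a run of consecutive dates,
       -- so it identifies each island; number the rows inside each island
       row_number() over (partition by col1, date_add(col2, - seqnum) order by col2) as rnk
from (select t.*,
             -- running sequence per col1, ordered by date
             row_number() over (partition by col1 order by col2) as seqnum
      from myTable t
     ) t;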

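The subquery assigns a sequence number within each col1, and the outer query subtracts that sequence from col2. When there are no gaps between adjacent dates, the result is a constant. The difference therefore defines the "islands" of adjacent records, and the outer row_number() numbers the rows within each island.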
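
To see why the difference stays constant within a run, here is a minimal Python sketch that traces the same arithmetic on the sample data from the question. It is only an illustration of the idea, not Hive code: the function name rank_islands and the name island_key are made up for this example, and it assumes the dates are unique within each col1.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample rows from the question: (col1, col2)
rows = [
    (1, date(2020, 1, 15)),
    (1, date(2020, 1, 16)),
    (1, date(2020, 1, 17)),
    (1, date(2020, 1, 18)),
    (1, date(2020, 1, 20)),
    (2, date(2020, 1, 9)),
    (2, date(2020, 1, 10)),
    (2, date(2020, 1, 15)),
]


def rank_islands(rows):
    """Return (col1, col2, seqnum, island_key, rnk) tuples.

    Hypothetical helper: mimics the two row_number() steps of the query.
    """
    result = []
    ordered = sorted(rows)  # order by col1, then col2
    for col1, group in groupby(ordered, key=lambda r: r[0]):
        counts = {}  # rows seen so far for each island_key in this col1
        for seqnum, (_, col2) in enumerate(group, start=1):
            # Same role as date_add(col2, -seqnum) in the query
            island_key = col2 - timedelta(days=seqnum)
            counts[island_key] = counts.get(island_key, 0) + 1
            result.append((col1, col2, seqnum, island_key, counts[island_key]))
    return result


for row in rank_islands(rows):
    print(*row)
```

For user 1, the first four dates all map to the island key 2020-01-14, while 2020-01-20 maps to 2020-01-15, so the numbering restarts there.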

Solution 1
1 Accepted 2020-05-08 10:42:17
