Hive: row number that restarts at gaps in consecutive dates

Question

I have a table like this:

col1    col2
1      2020-01-15
1      2020-01-16
1      2020-01-17
1      2020-01-18
1      2020-01-20
2      2020-01-09
2      2020-01-10
2      2020-01-15

and I am calculating a rank like this:

select
    col1,
    col2,
    date_sub(col2, -row_number() over (partition by col1 order by col2)) as rnk
  from myTable

and getting this rank:

col1    col2        rnk
1      2020-01-15   1
1      2020-01-16   2
1      2020-01-17   3
1      2020-01-18   4
1      2020-01-20   5
2      2020-01-09   1
2      2020-01-10   2
2      2020-01-15   3

but I need the rank to look like this:

col1    col2        rnk
1      2020-01-15   1
1      2020-01-16   2
1      2020-01-17   3
1      2020-01-18   4
1      2020-01-20   1
2      2020-01-09   1
2      2020-01-10   2
2      2020-01-15   1

The rank should restart whenever the dates stop being consecutive. For example, 18 and 20 are not consecutive for user 1, so the rank is supposed to restart there. I am not sure how to achieve this.

Answer 1

You can use:

select t.*,
       row_number() over (partition by col1, date_add(col2, - seqnum) order by col2) as rank
from (select t.*, row_number() over (partition by col1 order by col2) as seqnum
      from t
     ) t;

The inner row_number() produces a sequence number (seqnum) for each row within col1, and the outer query subtracts that sequence from col2. The result is a constant when there are no gaps in adjacent values. So, the difference defines the "islands" of adjacent records, and the outer row_number() numbers the rows within each island.
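
To see why the difference stays constant inside an island, here is a small trace of the same arithmetic on the sample data from the question. It is plain Python, not Hive, and only a sketch: the function name island_ranks and the island_key column are made up for illustration, and it assumes the dates within each col1 are distinct.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample rows from the question: (col1, col2).
rows = [
    (1, date(2020, 1, 15)),
    (1, date(2020, 1, 16)),
    (1, date(2020, 1, 17)),
    (1, date(2020, 1, 18)),
    (1, date(2020, 1, 20)),
    (2, date(2020, 1, 9)),
    (2, date(2020, 1, 10)),
    (2, date(2020, 1, 15)),
]


def island_ranks(rows):
    """Return (col1, col2, island_key, rnk) tuples, mirroring the query's two steps.

    Assumes dates are distinct within each col1, so rows sharing an
    island key are always adjacent once sorted.
    """
    result = []
    for col1, part in groupby(sorted(rows), key=lambda r: r[0]):
        # Step 1 (inner row_number): seqnum = 1, 2, 3, ... per col1,
        # then island_key = col2 - seqnum days.
        keyed = [
            (col1, col2, col2 - timedelta(days=seqnum))
            for seqnum, (_, col2) in enumerate(part, start=1)
        ]
        # Step 2 (outer row_number): restart the count for each island key.
        for _, island in groupby(keyed, key=lambda r: r[2]):
            for rnk, (c1, c2, key) in enumerate(island, start=1):
                result.append((c1, c2, key, rnk))
    return result


for col1, col2, key, rnk in island_ranks(rows):
    print(col1, col2, key, rnk)
```

For user 1, the dates 15 through 18 all map to the key 2020-01-14, while 2020-01-20 maps to 2020-01-15, so the count restarts at that row.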

Answer 1: accepted, 2020-05-08 10:42:17
