Hive window functions: last value of previous partition

Using Hive window functions, I would like to get the last value of the previous partition:

| name | rank | type |
| one  | 1    | T1   |
| two  | 2    | T2   |
| thr  | 3    | T2   |
| fou  | 4    | T1   |
| fiv  | 5    | T2   |
| six  | 6    | T2   |
| sev  | 7    | T2   |

The following query:

SELECT 
  name, 
  rank, 
  type, 
  first_value(rank) over (partition by type order by rank) as new_rank 
FROM my_table

will give:

| name | rank | type | new_rank |
| one  | 1    | T1   |   1      |
| two  | 2    | T2   |   2      |
| thr  | 3    | T2   |   2      |
| fou  | 4    | T1   |   4      |
| fiv  | 5    | T2   |   5      |
| six  | 6    | T2   |   5      |
| sev  | 7    | T2   |   5      |

But what I need is the "last value of the previous partition":

| name | rank | type | new_rank |
| one  | 1    | T1   |   NULL   |
| two  | 2    | T2   |   1      |
| thr  | 3    | T2   |   1      |
| fou  | 4    | T1   |   3      |
| fiv  | 5    | T2   |   4      |
| six  | 6    | T2   |   4      |
| sev  | 7    | T2   |   4      |

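This seems rather tricky. It is a variation of the "gaps and islands" problem. Here is the idea:

  1. Identify the "islands" where the type is the same (using a difference of row numbers).
  2. Then use lag() to bring the previous row's rank into each row of the island.
  3. Take the minimum over each island to get the new rank that you want.

So: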

with gi as (
      select t.*,
             (seqnum - seqnum_t) as grp
      from (select t.*,
                   row_number() over (partition by type order by rank) as seqnum_t,
                   row_number() over (order by rank) as seqnum
            from t
           ) t
      ),
      gi2 as (
       select gi.*, lag(rank) over (order by gi.rank) as prev_rank
       from gi
      )
select gi2.*,
       min(prev_rank) over (partition by type, grp) as new_rank
from gi2
order by rank;

Here is a SQL Fiddle (albeit using Postgres).
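
To check the three steps against the sample data without a Hive cluster, the query can be run on an in-memory SQLite database from Python. This is only a sketch under a few assumptions: the table is named t as in the answer, the bundled SQLite is version 3.25 or later (needed for window functions), and the helper name run_islands_query is made up for illustration. SQLite is not Hive, so this confirms the logic rather than Hive-specific behavior.

```python
import sqlite3

# Sample rows from the question: (name, rank, type).
ROWS = [
    ("one", 1, "T1"), ("two", 2, "T2"), ("thr", 3, "T2"),
    ("fou", 4, "T1"), ("fiv", 5, "T2"), ("six", 6, "T2"),
    ("sev", 7, "T2"),
]

# Same structure as the answer's query; grp and prev_rank are kept in the
# output so the intermediate steps are visible.
QUERY = """
with gi as (
    select s.*, (seqnum - seqnum_t) as grp
    from (select t.*,
                 row_number() over (partition by type order by rank) as seqnum_t,
                 row_number() over (order by rank) as seqnum
          from t) s
),
gi2 as (
    select gi.*, lag(rank) over (order by rank) as prev_rank
    from gi
)
select name, rank, type, grp, prev_rank,
       min(prev_rank) over (partition by type, grp) as new_rank
from gi2
order by rank
"""


def run_islands_query(rows=ROWS):
    """Load the rows into an in-memory table t and run the islands query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (name text, rank integer, type text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_islands_query():
        print(row)
```

Within one island, seqnum and seqnum_t both increase by one per row, so their difference grp stays constant; grp alone can repeat across types, which is why the final window partitions by both type and grp. The minimum of prev_rank over an island is the rank of the row just before the island starts, and it is NULL for the first island because lag() has no earlier row.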


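Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.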
