
Hive window functions: last value of previous partition

Using Hive window functions, I would like to get the last value of the previous partition. Given this table:

| name | rank | type |
| one  | 1    | T1   |
| two  | 2    | T2   |
| thr  | 3    | T2   |
| fou  | 4    | T1   |
| fiv  | 5    | T2   |
| six  | 6    | T2   |
| sev  | 7    | T2   |

The following query:

SELECT
  name,
  rank,
  type,
  first_value(rank) over (partition by type order by rank) as new_rank
FROM my_table

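It is meant to give the first rank of each consecutive run of the same type (note that partitioning by type alone returns 1 for every T1 row and 2 for every T2 row):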

| name | rank | type | new_rank |
| one  | 1    | T1   |   1      |
| two  | 2    | T2   |   2      |
| thr  | 3    | T2   |   2      |
| fou  | 4    | T1   |   4      |
| fiv  | 5    | T2   |   5      |
| six  | 6    | T2   |   5      |
| sev  | 7    | T2   |   5      |

But what I need is "the last value of the previous partition":

| name | rank | type | new_rank |
| one  | 1    | T1   |   NULL   |
| two  | 2    | T2   |   1      |
| thr  | 3    | T2   |   1      |
| fou  | 4    | T1   |   3      |
| fiv  | 5    | T2   |   4      |
| six  | 6    | T2   |   4      |
| sev  | 7    | T2   |   4      |

This seems quite tricky. It is a variant of the gaps-and-islands problem. Here is the idea:

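  1. Identify the "islands", i.e. the consecutive runs of rows where type is the same (using the difference of two row numbers).
  2. Then use lag() to bring the previous row's rank into each island.
  3. Take the minimum of those lagged values over each island to get the new rank that you want; because rank is ascending, the smallest one is the last rank of the previous island.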
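
The three steps above can be traced outside Hive. The following is a minimal Python sketch, not the query itself: the function name `last_value_of_previous_island` and the `ROWS` list are made up for illustration, and the data is the sample table from the question.

```python
from collections import defaultdict

# Sample rows from the question: (name, rank, type).
ROWS = [
    ("one", 1, "T1"), ("two", 2, "T2"), ("thr", 3, "T2"), ("fou", 4, "T1"),
    ("fiv", 5, "T2"), ("six", 6, "T2"), ("sev", 7, "T2"),
]


def last_value_of_previous_island(rows):
    """Mirror the three steps; returns (name, rank, type, new_rank) tuples."""
    ordered = sorted(rows, key=lambda r: r[1])

    per_type_count = defaultdict(int)
    annotated = []
    prev_rank = None
    for seqnum, (name, rank, type_) in enumerate(ordered, start=1):
        # Step 1: overall row number minus per-type row number labels the island.
        per_type_count[type_] += 1
        grp = seqnum - per_type_count[type_]
        # Step 2: equivalent of lag(rank) over (order by rank).
        annotated.append((name, rank, type_, grp, prev_rank))
        prev_rank = rank

    # Step 3: minimum lagged rank per (type, grp), skipping None as SQL min() skips NULL.
    island_min = {}
    for _, _, type_, grp, prev in annotated:
        key = (type_, grp)
        if prev is not None:
            island_min[key] = min(island_min.get(key, prev), prev)

    return [
        (name, rank, type_, island_min.get((type_, grp)))
        for name, rank, type_, grp, _ in annotated
    ]


if __name__ == "__main__":
    for row in last_value_of_previous_island(ROWS):
        print(row)
```

The island label `grp` is only meaningful together with `type`, which is why both the sketch and the query group on the pair.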

So:

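with gi as (
      -- Step 1: seqnum - seqnum_t is constant within each run of consecutive rows of the same type
      select t.*,
             (seqnum - seqnum_t) as grp
      from (select t.*,
                   row_number() over (partition by type order by rank) as seqnum_t,
                   row_number() over (order by rank) as seqnum
            from my_table t   -- my_table is the table from the question
           ) t
      ),
      gi2 as (
       -- Step 2: rank of the preceding row; NULL for the very first row
       select gi.*, lag(rank) over (order by gi.rank) as prev_rank
       from gi
      )
-- Step 3: the smallest prev_rank in an island is the last rank of the previous island
select gi2.*,
       min(prev_rank) over (partition by type, grp) as new_rank
from gi2
order by rank;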

Here is a SQL Fiddle (albeit using Postgres).
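
To try the query without a Hive cluster, one option is Python's built-in sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first release with window functions). This is only a sketch: SQLite is not Hive, the helper name `run_query` is hypothetical, the table is named `my_table` as in the question, and the final select lists four columns instead of `gi2.*` to keep the output short.

```python
import sqlite3

# Sample rows from the question: (name, rank, type).
ROWS = [
    ("one", 1, "T1"), ("two", 2, "T2"), ("thr", 3, "T2"), ("fou", 4, "T1"),
    ("fiv", 5, "T2"), ("six", 6, "T2"), ("sev", 7, "T2"),
]

QUERY = """
with gi as (
      select t.*,
             (seqnum - seqnum_t) as grp
      from (select t.*,
                   row_number() over (partition by type order by rank) as seqnum_t,
                   row_number() over (order by rank) as seqnum
            from my_table t
           ) t
      ),
      gi2 as (
       select gi.*, lag(rank) over (order by gi.rank) as prev_rank
       from gi
      )
select gi2.name, gi2.rank, gi2.type,
       min(prev_rank) over (partition by type, grp) as new_rank
from gi2
order by rank;
"""


def run_query(rows):
    """Load the rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table my_table (name text, rank integer, type text)")
        conn.executemany("insert into my_table values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query(ROWS):
        print(row)
```

NULL values come back as `None` in Python, so the first row's `new_rank` is `None`.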


 