Aggregate Sliding Window for hive

I have a Hive table that is sorted in descending order by a numeric column, say count.

fruit   count
------  -------
apple   10
orange  8
banana  5
melon   3
pears   1

The total count is 27. I need it divided into three segments: the first third of the count, i.e. 1 to 9, is zone one, 10 to 18 is zone two, and 19 to 27 is zone three. I guess I need some sort of sliding window. The result I want is:

fruit   count    zone
------  ------- --------
apple   10      one
orange  8       two
banana  5       three
melon   3       three
pears   1       three

Any idea how to approach this?

One way to do it in SQL:

select *,
floor(  -- in Hive, / returns a double, so floor() keeps only the whole-number quotient
sum(count) over (partition by 1 order by count desc)  -- running total
/ (sum(count) over (partition by 1) / 3)  -- zone size: the total count divided into 3 groups, which is 9 in your case
)  -- running total divided by the zone size
+  -- e.g. 11 / 9 = 1 remainder 2, so 1 must be added to the quotient to put 11 in the right zone
(
case when
(
sum(count) over (partition by 1 order by count desc)
% (sum(count) over (partition by 1) / 3)  -- remainder of the same division
) > 1 then 1 else 0 end  -- if the remainder is greater than 1, the zone is the quotient + 1
) as zone
from yourtable
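
To check the arithmetic outside Hive, here is a minimal Python sketch that mirrors the query above on the sample rows from the question. It assumes whole-number division, a total that divides evenly by 3, and no tied counts; the function name assign_zones is made up for illustration and is not part of the original answer.

```python
# Sample rows from the question, already sorted by count descending.
rows = [("apple", 10), ("orange", 8), ("banana", 5), ("melon", 3), ("pears", 1)]


def assign_zones(rows, zones=3):
    """Hypothetical helper: quotient of the running total by the zone size,
    plus 1 when the remainder is greater than 1 (same rule as the query)."""
    total = sum(count for _, count in rows)
    zone_size = total // zones  # 27 // 3 = 9 for the sample data
    running = 0
    result = []
    for fruit, count in rows:
        running += count  # running total in descending count order
        quotient, remainder = divmod(running, zone_size)
        zone = quotient + (1 if remainder > 1 else 0)
        result.append((fruit, count, zone))
    return result


for fruit, count, zone in assign_zones(rows):
    print(fruit, count, zone)
```

For the sample data the running totals are 10, 18, 23, 26 and 27, which gives zones 1, 2, 3, 3, 3 and matches the expected output in the question.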
