简体   繁体   中英

Using QUALIFY Row_Number in hive

I'm working with Teradata conversion to Hive (version 0.10.0).

Teradata Query :

QUALIFY ROW_NUMBER() OVER (PARTITION BY ADJSTMNT,SRC_CMN , TYPE_CMD,IOD_TYPE_CD,ROE_PST ,ORDR_SYC,SOR_CD,PROS_ED ORDER BY ADJSTMNT )=1

I did my search and found UDF for Row_Sequence in hive. I also replaced Over Partition with Distribute All and sort By. But I am stuck with QUALIFY.

Any ideas to convert the above to hive are really appreciated and will help us a lot.

a QUALIFY with analytics function (ROW_NUMBER(), SUM(), COUNT(), ... over (partition by ...)) is just a WHERE on a subquery containing the analytics value.

eg:

select A,B,C
from X 
QUALIFY  ROW_NUMBER() over (...) = 1

is equivalent to :

select A,B,C
from (
   select A,B,C, ROW_NUMBER() over (...) as RNUM
   from X
) t
where RNUM = 1

NB: analytics function are available in Hive 0.12

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM