简体   繁体   中英

Using row_number update column value using previous row value

I have a table in hive contents are below

+---+----------+----------+----------+--------+
| pk|    from_d|      to_d| load_date| row_num|
+---+----------+----------+----------+--------+
|111|2019-03-03|2019-03-03|2019-03-03|       1|
|111|2019-02-02|2019-02-02|2019-02-02|       2|
|111|2019-01-01|2019-01-01|2019-01-01|       3|
|222|2019-03-03|2019-03-03|2019-03-03|       1|
|222|2019-01-01|2019-01-01|2019-01-01|       2|
|333|2019-02-02|2019-02-02|2019-02-02|       1|
|333|2019-01-01|2019-01-01|2019-01-01|       2|
|444|2019-02-02|2019-02-02|2019-02-02|       1|
|555|2019-03-03|2019-03-03|2019-03-03|       1|
+---+----------+----------+----------+--------+

Now I want to update the to_d column where row_num > 1 using some conditions like below

when row_num = 2 then to_d column should have row_num 1 row's from_d - 1 day 
when row_num = 3 then to_d column should have row_num 2 row's from_d - 1 day
and so on

if row_num =1 then to_d should not be updated

Please don't get me wrong I am just trying to help one of the other user's question Updating column values based on the other table values in hive tables

I was trying this approach but I am unable proceed further from here

You can use LAG function like below

select pk, from_d, 
          case when row_num = 1 then to_d else date_sub(lag(to_d) over (), 1) end as to_d, 
          row_num from table;

This will give you the desired result

+---+----------+----------+-------------------+
| pk|    from_d|      to_d|row_number_window_0|
+---+----------+----------+-------------------+
|111|2019-03-03|2019-03-03|                  1|
|111|2019-02-02|2019-03-02|                  2|
|111|2019-01-01|2019-02-01|                  3|
|222|2019-03-03|2019-03-03|                  1|
|222|2019-01-01|2019-03-02|                  2|
|333|2019-02-02|2019-02-02|                  1|
|333|2019-01-01|2019-02-01|                  2|
|444|2019-02-02|2019-02-02|                  1|
|555|2019-03-03|2019-03-03|                  1|
+---+----------+----------+-------------------+

You can use IF condition to sort the problem... Let me give example. select from_d,to_d,row_num, IF(row_num=1,1,row_num-1)... so for whenever row_num is greater than 1 your row_num is subtracted by 1. Hope you satisfied.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM