
Hive/SQL: How do you access the value of a column you just computed for the previous row?

I have a table uv_user_date that looks like this: [image not included]

It's basically a user login table that shows each user's cumulative number of login days, partitioned by user_id. The column pre holds the user's previous login date for each login record.

Based on this I want to compute the consecutive login days for each user record.

The answer should be: [image not included]

My idea is, for each record:

  • if uv_date - pre = 1 day
    • then consecutive login days is the previous record's consecutive login days + 1
  • else
    • 1

But I am having trouble accessing the previous record's consecutive login days value.

The code would be something like:

SELECT *,
   if(pre = date_add(uv_date, -1), last(consecutive_days) + 1, 1) consecutive_days
FROM uv_user_date

Is there any way to get the value of last(consecutive_days)?

First, find the number of days between each login and the previous one (1 for a user's first login):

tbl1:
select *,
       -- "pre = NULL" never evaluates to true; test with IS NULL instead
       if(pre IS NULL, 1, datediff(uv_date, pre)) as diff
  from your_table

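Then, for each user_id, take the difference between the cumulative sum of diff and accumulative_uv_date. This difference stays constant within a run of consecutive days and increases whenever a day is skipped, so you can use it as a group key (rnk):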

tbl2:
select *,
       sum(diff) over (partition by user_id order by uv_date
                       rows between unbounded preceding and current row)
         - accumulative_uv_date as rnk
  from tbl1

Finally, count the consecutive days within each (user_id, rnk) group:

select user_id, uv_date, rnk,
       row_number() over (partition by user_id, rnk order by uv_date) as consecutive_days
  from tbl2
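
To see why the three steps above produce the streak length, it may help to trace them by hand. The following is a minimal Python sketch, not Hive code: the sample rows are made up for illustration, it handles a single user already ordered by uv_date, and it assumes accumulative_uv_date counts that user's logins as 1, 2, 3, and so on. The function names are hypothetical. It also runs the recurrence from the question (previous value + 1, else 1) to check that both approaches agree.

```python
from datetime import date

# Hypothetical sample rows for a single user, ordered by uv_date:
# (uv_date, accumulative_uv_date, pre)
rows = [
    (date(2021, 1, 1), 1, None),
    (date(2021, 1, 2), 2, date(2021, 1, 1)),
    (date(2021, 1, 3), 3, date(2021, 1, 2)),
    (date(2021, 1, 5), 4, date(2021, 1, 3)),
    (date(2021, 1, 6), 5, date(2021, 1, 5)),
    (date(2021, 1, 9), 6, date(2021, 1, 6)),
]


def streaks_via_rnk(rows):
    """Mirror the three SQL steps: diff, running sum minus counter, row_number."""
    running_diff = 0   # sum(diff) over (... unbounded preceding and current row)
    rows_per_rnk = {}  # row_number() over (partition by rnk order by uv_date)
    result = []
    for uv_date, accumulative_uv_date, pre in rows:
        # Step 1: days since the previous login, 1 for the first login.
        diff = 1 if pre is None else (uv_date - pre).days
        # Step 2: the running sum minus the login counter equals the number
        # of skipped days so far, which is constant within one streak.
        running_diff += diff
        rnk = running_diff - accumulative_uv_date
        # Step 3: position of this row within its rnk group.
        rows_per_rnk[rnk] = rows_per_rnk.get(rnk, 0) + 1
        result.append((uv_date, rnk, rows_per_rnk[rnk]))
    return result


def streaks_via_recurrence(rows):
    """The row-by-row idea from the question: previous value + 1, else 1."""
    result = []
    previous = 0
    for uv_date, _, pre in rows:
        is_next_day = pre is not None and (uv_date - pre).days == 1
        previous = previous + 1 if is_next_day else 1
        result.append(previous)
    return result


for uv_date, rnk, consecutive_days in streaks_via_rnk(rows):
    print(uv_date.isoformat(), rnk, consecutive_days)
```

On these sample rows rnk comes out as 0, 0, 0, 1, 1, 3 and consecutive_days as 1, 2, 3, 1, 2, 1, the same values the recurrence yields. The window-function version gets there without referring to its own previous output, which is what a single SELECT cannot do.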
