I have a requirement in Redshift where I need to combine rows when their time ranges are continuous. I have the following table, where user_id and product_id are varchar and login_time and log_out_time are timestamp.
user_id  product_id  login_time           log_out_time
----------------------------------------------------------------------
ashok    facebook    1/1/2017 1:00:00 AM  1/1/2017 2:00:00 AM
ashok    facebook    1/1/2017 2:00:00 AM  1/1/2017 3:00:00 AM
ashok    facebook    1/1/2017 3:00:00 AM  1/1/2017 4:00:00 AM
ashok    linked_in   1/1/2017 5:00:00 AM  1/1/2017 6:00:00 AM
ashok    linked_in   1/1/2017 6:00:00 AM  1/1/2017 7:00:00 AM
ashok    facebook    1/1/2017 8:00:00 AM  1/1/2017 9:00:00 AM
ram      facebook    1/1/2017 9:00:00 AM  1/1/2017 10:00:00 AM
ashok    linked_in   1/1/2017 7:00:00 AM  1/1/2017 8:00:00 AM
I need to combine the rows when the time ranges are continuous for a given user_id and product_id. So my output should look like this:
user_id  product_id  login_time           log_out_time
----------------------------------------------------------------------
ashok    facebook    1/1/2017 1:00:00 AM  1/1/2017 4:00:00 AM
ashok    facebook    1/1/2017 8:00:00 AM  1/1/2017 9:00:00 AM
ashok    linked_in   1/1/2017 5:00:00 AM  1/1/2017 8:00:00 AM
ram      facebook    1/1/2017 9:00:00 AM  1/1/2017 10:00:00 AM
I tried the following query, but it didn't help:
SELECT user_id, product_id, MIN(login_time), MAX(log_out_time) FROM TABLE_NAME GROUP BY user_id, product_id
The query above fails to give the required output because it has no logic to check whether the rows are continuous in time. I need a query for this that does not use any custom function, but I am allowed to use any built-in Redshift function.
You can use lag() to identify where groups start, then a cumulative sum to identify the groups, then group by to aggregate the results:
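select user_id, product_id, min(login_time), max(log_out_time)
from (select t.*,
             -- running count of group starts: a row starts a new group
             -- when its login_time differs from the previous log_out_time
             sum(case when prev_lt = login_time then 0 else 1 end) over
                 (partition by user_id, product_id
                  order by login_time
                  rows between unbounded preceding and current row
                 ) as grp
      from (select t.*,
                   -- prev_lt is the previous row's log_out_time
                   -- (null for the first row of each user/product)
                   lag(log_out_time) over (partition by user_id, product_id order by login_time) as prev_lt
            from t
           ) t
     ) t
group by user_id, product_id, grp;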
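To see why this works, it can help to look at the intermediate columns before the final aggregation. The following is only a sketch: it assumes the table is named t (as in the query above), loads the sample rows from the question into a temporary table, and returns prev_lt and grp for each row.

```sql
-- Sample data from the question; the temp table name t is an assumption.
create temp table t (
    user_id      varchar(50),
    product_id   varchar(50),
    login_time   timestamp,
    log_out_time timestamp
);

insert into t values
    ('ashok', 'facebook',  '2017-01-01 01:00:00', '2017-01-01 02:00:00'),
    ('ashok', 'facebook',  '2017-01-01 02:00:00', '2017-01-01 03:00:00'),
    ('ashok', 'facebook',  '2017-01-01 03:00:00', '2017-01-01 04:00:00'),
    ('ashok', 'linked_in', '2017-01-01 05:00:00', '2017-01-01 06:00:00'),
    ('ashok', 'linked_in', '2017-01-01 06:00:00', '2017-01-01 07:00:00'),
    ('ashok', 'facebook',  '2017-01-01 08:00:00', '2017-01-01 09:00:00'),
    ('ram',   'facebook',  '2017-01-01 09:00:00', '2017-01-01 10:00:00'),
    ('ashok', 'linked_in', '2017-01-01 07:00:00', '2017-01-01 08:00:00');

-- Inner two levels of the query only: show the previous log_out_time
-- and the running group number for every row.
select user_id, product_id, login_time, log_out_time, prev_lt,
       sum(case when prev_lt = login_time then 0 else 1 end) over
           (partition by user_id, product_id
            order by login_time
            rows between unbounded preceding and current row
           ) as grp
from (select t.*,
             lag(log_out_time) over (partition by user_id, product_id order by login_time) as prev_lt
      from t
     ) t
order by user_id, product_id, login_time;
```

For ashok/facebook, the first three rows get grp = 1 because each login_time equals the previous log_out_time, and the 8:00 AM row gets grp = 2 because its prev_lt is 4:00 AM. The first row of each partition has a null prev_lt, so the comparison is not true and the case expression falls through to 1, which starts the first group. Note that only rows that touch exactly (previous log_out_time equal to login_time) are merged.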