Athena/SQL query to get the desired result
sample_input_table
user  name  action    date
1     aaa   view      2020-09-03
2     bbb   view      2020-09-02
3     ccc   view      2020-08-28
4     ddd   view      2020-08-25
1     aaa   purchase  2020-09-09
I have a table with a huge number of rows; it looks like the one above.
question
I want to select the rows that satisfy all of the following:
1. The row has a purchase action.
2. The user with the purchase action must also have a row with a view action.
3. That view action must fall in the date range between purchase_date - 7 days (2020-09-02) and purchase_date (2020-09-09).
I want to achieve these 3 points in one SQL query.
sample_output
user  name  action    date
1     aaa   purchase  2020-09-09
Comparing the sample output with the sample input, the view action for this user falls within the timeframe of 2020-09-02 to 2020-09-09 (purchase_date - 7 days, purchase_date). Can anyone suggest a solution for this?
You can use exists:
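-- keep each purchase row that has a matching view by the same user
select t.*
from mytable t
where t.action = 'purchase' and exists (
    select 1
    from mytable t1
    where
        t1.user = t.user
        and t1.action = 'view'
        -- the view falls in the 7 days before the purchase
        and t1.date >= t.date - interval '7' day
        and t1.date < t.date
)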
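To check the exists approach against the sample data without touching a real table, the rows can be supplied inline. This is only a sketch: it assumes Athena's Presto/Trino dialect for the VALUES clause and date literals, reuses the table name mytable from the query above as a CTE, and double-quotes "user" and "date" purely as a precaution in case an engine treats those words specially.

```sql
-- Inline copy of sample_input_table (assumed Presto/Trino syntax).
with mytable as (
    select *
    from (values
        (1, 'aaa', 'view',     date '2020-09-03'),
        (2, 'bbb', 'view',     date '2020-09-02'),
        (3, 'ccc', 'view',     date '2020-08-28'),
        (4, 'ddd', 'view',     date '2020-08-25'),
        (1, 'aaa', 'purchase', date '2020-09-09')
    ) as v("user", name, action, "date")
)
select t.*
from mytable t
where t.action = 'purchase'
  and exists (
      select 1
      from mytable t1
      where t1."user" = t."user"
        and t1.action = 'view'
        and t1."date" >= t."date" - interval '7' day
        and t1."date" <  t."date"
  );
-- Returns one row: 1, aaa, purchase, 2020-09-09
```

Note that the strict `<` excludes a view made on the purchase day itself; if same-day views should count, `<=` would be the comparison to use.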
You can use window functions. Assuming "purchase" is the last state:
select t.*
from (select t.*,
             -- latest purchase date and latest view date per user
             max(case when action = 'purchase' then date end) over (partition by user) as purchase_date,
             max(case when action = 'view' then date end) over (partition by user) as max_view_date
      from t
     ) t
where action = 'purchase' and
      -- Athena/Presto interval literal: the unit goes outside the quotes
      max_view_date >= purchase_date - interval '7' day;
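If "purchase" is not guaranteed to be the last state, a view made after the purchase would also satisfy the condition above. One possible variation, sketched here against the sample rows, computes the purchase date first and then only considers views up to that date. The CTE names mytable, purchases and flagged are hypothetical, the syntax assumes Athena's Presto/Trino dialect, and the sketch still assumes at most one purchase per user.

```sql
-- Inline copy of sample_input_table (assumed Presto/Trino syntax).
with mytable as (
    select *
    from (values
        (1, 'aaa', 'view',     date '2020-09-03'),
        (2, 'bbb', 'view',     date '2020-09-02'),
        (3, 'ccc', 'view',     date '2020-08-28'),
        (4, 'ddd', 'view',     date '2020-08-25'),
        (1, 'aaa', 'purchase', date '2020-09-09')
    ) as v("user", name, action, "date")
),
-- Step 1: attach each user's purchase date to all of that user's rows.
purchases as (
    select t.*,
           max(case when action = 'purchase' then "date" end)
               over (partition by "user") as purchase_date
    from mytable t
),
-- Step 2: latest view that is not later than the purchase.
flagged as (
    select p.*,
           max(case when action = 'view' and "date" <= purchase_date then "date" end)
               over (partition by "user") as max_view_date
    from purchases p
)
select "user", name, action, "date"
from flagged
where action = 'purchase'
  and max_view_date >= purchase_date - interval '7' day;
-- Returns one row: 1, aaa, purchase, 2020-09-09
```

Users without a purchase get a null purchase_date, so none of their rows pass the final filter.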