Athena/SQL query to get the desired result

sample_input_table

user  name  action    date
1     aaa   view      2020-09-03
2     bbb   view      2020-09-02
3     ccc   view      2020-08-28
4     ddd   view      2020-08-25
1     aaa   purchase  2020-09-09

I have a table with a huge number of rows; it looks like the one above.

Question

  1. I want to print the rows that have the purchase action.
  2. The user who made the purchase must also have a row with the view action.
  3. That view must fall within the date range from purchase_date - 7 days (2020-09-02) to purchase_date (2020-09-09).

I want to achieve these three points in a single SQL query.

sample_output

user  name  action    date
1     aaa   purchase  2020-09-09

Comparing the sample output with the sample input:

  1. The end result contains only purchase events.
  2. The purchasing user also has a row with the view action.
  3. That view falls between 2020-09-02 and 2020-09-09 (purchase_date - 7 days and purchase_date).

Can anyone suggest a solution for this?

You can use exists:

select t.*
from mytable t
where t.action = 'purchase' and exists (
    select 1
    from mytable t1
    where 
        t1.user = t.user 
        and t1.action = 'view'
        and t1.date >= t.date - interval '7' day
        and t1.date < t.date
    )
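To sanity-check the exists logic on the sample rows without touching Athena, here is a minimal sketch using Python's built-in sqlite3 module. It is an adaptation, not the Athena query itself: SQLite's date(x, '-7 days') stands in for `x - interval '7' day`, dates are stored as ISO-8601 text, and the helper name purchases_with_recent_view is made up for this illustration.

```python
import sqlite3

# Sample rows from the question. Dates are ISO-8601 strings, which SQLite
# compares correctly as plain text.
ROWS = [
    (1, "aaa", "view", "2020-09-03"),
    (2, "bbb", "view", "2020-09-02"),
    (3, "ccc", "view", "2020-08-28"),
    (4, "ddd", "view", "2020-08-25"),
    (1, "aaa", "purchase", "2020-09-09"),
]

# SQLite adaptation of the exists query (assumption: date(x, '-7 days')
# replaces Athena's interval arithmetic). "user" and "date" are quoted
# defensively because they look like keywords.
QUERY = """
select t."user", t.name, t.action, t."date"
from mytable t
where t.action = 'purchase' and exists (
    select 1
    from mytable t1
    where t1."user" = t."user"
      and t1.action = 'view'
      and t1."date" >= date(t."date", '-7 days')
      and t1."date" < t."date"
)
"""


def purchases_with_recent_view(rows):
    """Load rows into an in-memory table and run the adapted query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'create table mytable ("user" integer, name text, action text, "date" text)'
        )
        conn.executemany("insert into mytable values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = purchases_with_recent_view(ROWS)
print(result)  # [(1, 'aaa', 'purchase', '2020-09-09')]
```

Note that the strict `<` on the upper bound means a view on the same day as the purchase does not count; switch it to `<=` if same-day views should qualify.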

You can use window functions. Assuming "purchase" is the last state:

select t.*
from (select t.*,
             -- latest purchase date and latest view date per user
             max(case when action = 'purchase' then date end) over (partition by user) as purchase_date,
             max(case when action = 'view' then date end) over (partition by user) as max_view_date
      from t
     ) t
where action = 'purchase' and
      -- Athena (Presto) interval syntax: quoted number, then the unit
      max_view_date >= purchase_date - interval '7' day;
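To make the "purchase is the last state" assumption concrete, the following plain-Python sketch mimics what the two windowed max(...) columns compute and then applies the same filter. It is only an illustration of the logic, not Athena code, and the function name purchases_by_window_logic is hypothetical.

```python
from datetime import date, timedelta

# Sample rows from the question, with real date objects.
ROWS = [
    (1, "aaa", "view", date(2020, 9, 3)),
    (2, "bbb", "view", date(2020, 9, 2)),
    (3, "ccc", "view", date(2020, 8, 28)),
    (4, "ddd", "view", date(2020, 8, 25)),
    (1, "aaa", "purchase", date(2020, 9, 9)),
]


def purchases_by_window_logic(rows, days=7):
    """Mirror the per-user max(purchase date) / max(view date) and the filter."""
    purchase_date = {}  # user -> latest purchase date
    max_view_date = {}  # user -> latest view date
    for user, _name, action, day in rows:
        if action == "purchase":
            purchase_date[user] = max(purchase_date.get(user, day), day)
        elif action == "view":
            max_view_date[user] = max(max_view_date.get(user, day), day)

    kept = []
    for row in rows:
        user, _name, action, _day = row
        if action != "purchase":
            continue
        last_view = max_view_date.get(user)
        # In SQL a comparison with NULL is never true, so users with no
        # view rows drop out; the explicit None check plays that role here.
        if last_view is not None and last_view >= purchase_date[user] - timedelta(days=days):
            kept.append(row)
    return kept


result = purchases_by_window_logic(ROWS)
print(result)
```

Because there is no upper bound on the view date, a view that happens after the purchase also satisfies the filter, and with several purchases per user every purchase row is compared against the latest purchase date. That is why this variant only matches the question's requirements when the purchase really is each user's final event.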
