SQL Redshift query to select first x dates within each group

Suppose my table looks like the following:

user_id   login_date
1   2019-03-13 00:00:00.000000
1   2019-04-07 00:00:00.000000
1   2018-10-19 00:00:00.000000
1   2018-11-12 00:00:00.000000
1   2018-04-11 00:00:00.000000
6   2018-11-18 00:00:00.000000
6   2018-07-07 00:00:00.000000
6   2019-09-04 00:00:00.000000
6   2018-07-31 00:00:00.000000
6   2019-10-20 00:00:00.000000
12  2018-12-17 00:00:00.000000
12  2018-07-06 00:00:00.000000
12  2018-04-21 00:00:00.000000
12  2019-07-28 00:00:00.000000
48  2018-12-01 00:00:00.000000
48  2019-11-11 00:00:00.000000
48  2019-03-10 00:00:00.000000
48  2018-10-13 00:00:00.000000
48  2019-02-21 00:00:00.000000
48  2018-01-04 00:00:00.000000

I would like to select the logins that fall within the first 2 days after each user's first login. In other words, either find the minimum login date per group and then select the logins within 48 hours of it, or sort the logins within each group and select those within the first 2 days.

Here is the SQL to create a similar table:

CREATE TABLE TEST (user_id INT, login_date DATE NOT NULL);
INSERT INTO TEST ( user_id, login_date)
VALUES
(1,'20190901'),
(1,'20140719'),
(1,'20101118'),
(1,'20101119'),
(1,'20141118'),
(6,'20110818'),
(6,'20070119'),
(6,'20090419'),
(6,'20070118'),
(6,'20100219'),
(12,'20120718'),
(12,'20070618'),
(12,'20041218'),
(12,'20041219'),
(48,'20120118'),
(48,'20111119'),
(48,'20031019'),
(48,'20100318'),
(48,'20021119'),
(48,'20010418');

You could use the window function first_value() in a subquery to retrieve the earliest login date per group, and then compare it to each login date in the outer query:

select 
    user_id, 
    login_date
from (
    select 
        t.*,
        -- earliest login in each user's partition
        first_value(login_date) over(
            partition by user_id 
            order by login_date
            rows between unbounded preceding and unbounded following
        ) first_login
    from test t
) t
where login_date < first_login + interval '2 days'
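To sanity-check what this query should return, the same two steps (earliest login per user, then a strict "less than first login + 2 days" filter) can be sketched in plain Python. This is only an illustration of the logic, not Redshift code: the rows are copied from the question's INSERT statement, and the helper name `logins_within_window` is made up for this example.

```python
from datetime import date, timedelta

# Rows copied from the INSERT statement in the question: (user_id, login_date).
ROWS = [
    (1, date(2019, 9, 1)), (1, date(2014, 7, 19)), (1, date(2010, 11, 18)),
    (1, date(2010, 11, 19)), (1, date(2014, 11, 18)),
    (6, date(2011, 8, 18)), (6, date(2007, 1, 19)), (6, date(2009, 4, 19)),
    (6, date(2007, 1, 18)), (6, date(2010, 2, 19)),
    (12, date(2012, 7, 18)), (12, date(2007, 6, 18)), (12, date(2004, 12, 18)),
    (12, date(2004, 12, 19)),
    (48, date(2012, 1, 18)), (48, date(2011, 11, 19)), (48, date(2003, 10, 19)),
    (48, date(2010, 3, 18)), (48, date(2002, 11, 19)), (48, date(2001, 4, 18)),
]


def logins_within_window(rows, days=2):
    """Hypothetical helper: keep logins less than `days` days after each user's first login."""
    # Step 1: earliest login per user (the role first_value() plays per partition).
    first_login = {}
    for user_id, login in rows:
        if user_id not in first_login or login < first_login[user_id]:
            first_login[user_id] = login

    # Step 2: same strict "<" comparison as the outer WHERE clause.
    window = timedelta(days=days)
    return sorted(
        (user_id, login)
        for user_id, login in rows
        if login < first_login[user_id] + window
    )


result = logins_within_window(ROWS)
for user_id, login in result:
    print(user_id, login.isoformat())
```

With the question's data this keeps seven rows: two each for users 1, 6 and 12, and only the first login for user 48, whose second login comes more than a year later.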

Another option is to use a correlated subquery for filtering:

select *
from test t
where login_date < (
    -- cutoff: this user's earliest login plus two days
    select min(login_date) + interval '2 days'
    from test t1
    where t1.user_id = t.user_id
)
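The correlated-subquery shape can be tried locally without a Redshift cluster. The sketch below runs it against an in-memory SQLite database through Python's sqlite3 module, loaded with the question's data. Two things here are assumptions and not part of the answer: dates are stored as ISO text, and SQLite's `date(x, '+2 days')` stands in for Redshift's `+ interval '2 days'`, which SQLite does not support.

```python
import sqlite3

# Rows from the question's INSERT statement, rewritten as ISO date strings
# (assumption: SQLite has no DATE type, so text in YYYY-MM-DD form is used).
ROWS = [
    (1, "2019-09-01"), (1, "2014-07-19"), (1, "2010-11-18"),
    (1, "2010-11-19"), (1, "2014-11-18"),
    (6, "2011-08-18"), (6, "2007-01-19"), (6, "2009-04-19"),
    (6, "2007-01-18"), (6, "2010-02-19"),
    (12, "2012-07-18"), (12, "2007-06-18"), (12, "2004-12-18"),
    (12, "2004-12-19"),
    (48, "2012-01-18"), (48, "2011-11-19"), (48, "2003-10-19"),
    (48, "2010-03-18"), (48, "2002-11-19"), (48, "2001-04-18"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (user_id INTEGER, login_date TEXT NOT NULL)")
conn.executemany("INSERT INTO test (user_id, login_date) VALUES (?, ?)", ROWS)

# Same structure as the correlated subquery above; only the date arithmetic
# is SQLite-specific.
QUERY = """
SELECT t.user_id, t.login_date
FROM test AS t
WHERE t.login_date < (
    SELECT date(min(t1.login_date), '+2 days')
    FROM test AS t1
    WHERE t1.user_id = t.user_id
)
ORDER BY t.user_id, t.login_date
"""

matches = conn.execute(QUERY).fetchall()
for user_id, login_date in matches:
    print(user_id, login_date)
```

Because ISO date strings sort in chronological order, the text comparison behaves like a date comparison here. On Redshift itself the original `+ interval '2 days'` form should be used.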
