
SQL finding number of users in consecutive months in a table

I have a table with the following columns:

email                created at
abc@gmail.com        2019-12-12 16:03:34
rp@gamil.com         2019-11-12 16:03:34
abc@gmail.com        2020-1-12 16:03:34
er@gmail.com         2020-1-12 16:03:34

I want to design a query that returns the number of emails that registered in 2 consecutive months. I am a novice with queries and have been struggling to come up with a query for this.

For the sample data above, abc@gmail.com registered twice in consecutive months (December 2019 and January 2020).

You can use an EXISTS query to check if an email exists that also had a registration in the previous month:

SELECT DISTINCT email
FROM yourtable t1
WHERE EXISTS (SELECT *
              FROM yourtable t2
              WHERE t2.email = t1.email 
                AND DATE_FORMAT(t2.createdat, '%Y%m') = DATE_FORMAT(t1.createdat - INTERVAL 1 MONTH, '%Y%m'))

Output for your sample data:

abc@gmail.com


We use DISTINCT so we don't get multiple copies of the same email if an email address registered in more than one pair of consecutive months.
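The question asks for the number of such emails rather than the list. A minimal sketch of that variant, assuming the same placeholder names as the query above (yourtable, createdat) and MySQL date functions, wraps the same EXISTS check in COUNT(DISTINCT ...). The consecutive_emails alias is only illustrative.

```sql
-- Count the emails that have a registration in some month and another
-- registration in the month immediately before it.
-- yourtable and createdat are placeholder names taken from the query above.
SELECT COUNT(DISTINCT t1.email) AS consecutive_emails
FROM yourtable t1
WHERE EXISTS (SELECT 1
              FROM yourtable t2
              WHERE t2.email = t1.email
                AND DATE_FORMAT(t2.createdat, '%Y%m') = DATE_FORMAT(t1.createdat - INTERVAL 1 MONTH, '%Y%m'));
```

COUNT(DISTINCT ...) plays the same role as DISTINCT in the list query: an email with several consecutive-month pairs is still counted once.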

A self-join on email and month + 1 (also taking December-to-January transitions into account) should work:

SELECT
    *
FROM
    (
        SELECT
            email,
            YEAR( created ) AS createdYear,
            MONTH( created ) AS createdMonth
        FROM
            yourtable -- placeholder name; TABLE is a reserved word and cannot be used unquoted
    ) AS t

    INNER JOIN
    (
        SELECT
            email,
            YEAR( created ) AS createdYear,
            MONTH( created ) AS createdMonth
        FROM
            yourtable
    ) AS monthPlus1 ON 
        t.email = monthPlus1.email
        AND
        (
            (
                -- same year: monthPlus1 is the month right after t
                monthPlus1.createdMonth = t.createdMonth + 1
                AND
                monthPlus1.createdYear = t.createdYear
            )
            OR
            (
                -- year boundary: t is December, monthPlus1 is January of the next year
                t.createdMonth = 12
                AND
                monthPlus1.createdMonth = 1
                AND
                t.createdYear + 1 = monthPlus1.createdYear
            )
        )

The date logic in this query is a bit gnarly. It can probably be improved by representing the month as a single date value or an integer months-since-epoch rather than a year + month tuple.
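One way to sketch the integer idea, assuming MySQL and the same created column, is to number months as year * 12 + month. Consecutive months then differ by exactly 1, including December to January, so the special case disappears. The table name yourtable and the monthIndex alias are placeholders, not names from the original schema.

```sql
-- Self-join on a single integer month index instead of a (year, month) pair.
-- yourtable and monthIndex are illustrative placeholders.
SELECT DISTINCT t.email
FROM (
    SELECT email, YEAR(created) * 12 + MONTH(created) AS monthIndex
    FROM yourtable
) AS t
INNER JOIN (
    SELECT email, YEAR(created) * 12 + MONTH(created) AS monthIndex
    FROM yourtable
) AS monthPlus1
    ON  monthPlus1.email = t.email
    AND monthPlus1.monthIndex = t.monthIndex + 1;
```

For example, December 2019 maps to 2019 * 12 + 12 = 24240 and January 2020 maps to 2020 * 12 + 1 = 24241, so the pair matches without any year-boundary branch.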

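You can use lag(). If an email registered in consecutive months, then for at least one of its rows the previous registration returned by lag() falls in the month immediately before that row's month. This requires a database with window functions (MySQL 8.0 or later).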

select t.email
from (select t.*,
             lag(created_at) over (partition by t.email order by created_at) as prev_created_at
      from t
     ) t
where extract(year_month from created_at) = extract(year_month from (prev_created_at + interval 1 month));

You may need select distinct if this can occur multiple times for the same email.
