简体   繁体   中英

Struggling with a min(date) solution

I have a dataset with these columns:

  • Date
  • Code
  • Link
  • Clicks

Each code represents a person, and my goal is to find the first date someone posted each link and how many clicks each link has. To be considered a legitimate post, a link must have a total of at least 100 clicks. This query ALMOST does what I need:

select 
 min(date) as post_date,
    max(date) as last_click,
 link,
  code,
 sum(clicks) as clicks 
from table1
where date >= '2017-01-01'
group by link, code
having sum(clicks) > 100

The problem is this query is giving me the first date ANY clicks registered for links with 100+ clicks. For example:

PersonA got 1 click for linkA on January 1st. On January 7th linkA reaches a total of 100+ clicks. Right now this query is returning January 1st, when I need it to return January 7th.

I hope I was clear enough (again very new to this). Any help would be GREATLY appreciated! :)

This is tricky. I think you need to use variables for this, in order to get the cumulative sum of the number of clicks (actually, you could use a correlated subquery as well):

select min(date) as post_date, max(date) as last_click,
       link, code,
       max(running_clicks) as clicks 
from (select t1.*,
             (@c := if(@lc = concat_ws(':', link, code), @c + clicks,
                       if(@lc := concat_ws(':', link, code), clicks, clicks)
                      )
             ) as running_clicks
      from table1 t1 cross join
           (select @rn := 0, @lc := '') params
      where t1.date >= '2017-01-01'
      order by link, code
     ) t
group by link, code
having running_clicks - clicks > 100;

Try this:

select
    min(date) post_date,
    max(date) last_click_date,
    link,
    code,
    max(sum_clicks) clicks_total
from (
    select
        t.*,
        @sum := if(@lastcode = code and @lastlink = link, @sum + clicks,
            if(
                (@lastcode := code) is not null and 
                (@lastlink := link) is not null,
                clicks, clicks
            )
        ) sum_clicks
    from (
        select *
        from table1
        order by code, link, date
    ) t cross join (
        select 
            @lastcode := null,
            @lastlink := null,
            @sum      := 0
    ) t2
) t 
where sum_clicks >= 100
group by link, code;

It uses user variables to cumulatively add the clicks in order of date for each user and filters out those rows where the count is not yet reached.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM