
Running Count Distinct using Over Partition By

I have a data set of user ids that have made purchases over time. I would like to show a YTD distinct count of users who have made a purchase, partitioned by State and Country. The output would have 5 columns: Country, State, Year, Month, and the YTD count of distinct users with purchase activity.

Is there a way to do this? The following code works when I exclude the month from the view and do a distinct count:

Select Year, Country, State,
   COUNT(DISTINCT (CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)) AS YTD_Active_Member_Count
From MemberActivity
Where Month <= 5
Group By 1,2,3;

The issue occurs when a user has purchases across multiple months: I can't aggregate at a monthly level and then sum, because that counts the same user more than once.

I need to see the YTD count for each month of the year, for trending purposes.
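To make the duplication concrete, here is a minimal sketch on made-up rows. The three sample rows are hypothetical, and the column names are simply the ones used in the query above (quote or rename Year and Month if your engine reserves them). Member 1 buys in months 1 and 2 and member 2 only in month 2, so the monthly distinct counts are 1 and 2 and add up to 3, while only 2 distinct members have purchased year to date.

```sql
-- Hypothetical sample rows, for illustration only.
create table MemberActivity (
    MBR_ID integer, Country varchar(2), State varchar(2),
    Year integer, Month integer, ActiveUserFlag integer
);
insert into MemberActivity values (1, 'US', 'CA', 2019, 1, 1);
insert into MemberActivity values (1, 'US', 'CA', 2019, 2, 1);
insert into MemberActivity values (2, 'US', 'CA', 2019, 2, 1);

-- Distinct members per month: 1 for month 1, 2 for month 2.
-- Adding these up gives 3, because member 1 is counted in both months.
select Year, Country, State, Month,
       count(distinct case when ActiveUserFlag > 0 then MBR_ID end) as Monthly_Active_Member_Count
from MemberActivity
group by Year, Country, State, Month;

-- Distinct members year to date through month 2: 2.
select Year, Country, State,
       count(distinct case when ActiveUserFlag > 0 then MBR_ID end) as YTD_Active_Member_Count
from MemberActivity
where Month <= 2
group by Year, Country, State;
```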

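Count each user only in the first month in which they are active within a year:

select Country, State, year, month,
       -- members first active in this month; accumulate over month to get the YTD figure
       sum(case when seqnum = 1 then 1 else 0 end) as YTD_Active_Member_Count
from (select ma.*,
             -- number each member's active months within the year
             row_number() over (partition by MBR_ID, year order by month) as seqnum
      from MemberActivity ma
      where ActiveUserFlag > 0 -- only rows with purchase activity
     ) ma
where Month <= 5
group by Country, State, year, month;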
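As far as I can tell, a query of this shape gives the number of members first seen in each month rather than a running total. The sketch below shows one way to accumulate those counts into a YTD figure by wrapping the monthly sum in a window sum. The sample rows are hypothetical, and it assumes each member's rows fall in a single Country/State per year; if members can move, Country and State would need to be added to the row_number() partition.

```sql
-- Hypothetical sample rows, for illustration only.
create table MemberActivity (
    MBR_ID integer, Country varchar(2), State varchar(2),
    Year integer, Month integer, ActiveUserFlag integer
);
insert into MemberActivity values (1, 'US', 'CA', 2019, 1, 1);
insert into MemberActivity values (1, 'US', 'CA', 2019, 2, 1); -- repeat buyer
insert into MemberActivity values (2, 'US', 'CA', 2019, 2, 1);
insert into MemberActivity values (3, 'US', 'CA', 2019, 3, 0); -- inactive row, ignored
insert into MemberActivity values (3, 'US', 'CA', 2019, 4, 1);

select Country, State, Year, Month,
       -- inner sum: members first active in this month
       -- outer sum: running total of those counts within the year
       sum(sum(case when seqnum = 1 then 1 else 0 end))
           over (partition by Country, State, Year
                 order by Month
                 rows unbounded preceding) as YTD_Active_Member_Count
from (select ma.*,
             row_number() over (partition by MBR_ID, Year order by Month) as seqnum
      from MemberActivity ma
      where ActiveUserFlag > 0
     ) ma
where Month <= 5
group by Country, State, Year, Month
order by Country, State, Year, Month;
-- With the rows above: month 1 -> 1, month 2 -> 2, month 4 -> 3.
-- Month 3 has no active rows, so it does not appear in the output.
```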

Return each member only once for the first month they make a purchase, count by month and then apply a Cumulative Sum:

select Year, Country, State, month,
   sum(cnt)
   over (partition by Year, Country, State
         order by month
         rows unbounded preceding) AS YTD_Active_Member_Count
from
  (
    Select Year, Country, State, month,
       COUNT(*) as cnt -- first purchases per month
    From
     ( -- this assumes there's at least one new active member per year/month/country/state,
       -- otherwise there would be missing rows
       Select *
       from MemberActivity
       where ActiveUserFlag > 0 -- only active members
         and Month <= 5
         -- and year = 2019 -- seems to be for this year only
       qualify row_number() -- only first purchase per member/year
               over (partition by MBR_ID, year
                     order by month -- probably there's a purchase_date to order by instead
                    ) = 1
     ) as first_purchases
    group by 1,2,3,4
 ) as monthly_counts
;
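The comment about missing rows can be addressed by building the full list of months first and left-joining the first purchases onto it. The following is only a sketch in generic SQL for engines without QUALIFY: it swaps the row_number() filter for min(Month) per member, and it derives the month list from the table itself, which assumes every month has at least one row of some kind (a real calendar table would be more robust). The sample rows and the names first_purchase, months and year_regions are hypothetical.

```sql
-- Hypothetical sample rows, for illustration only.
create table MemberActivity (
    MBR_ID integer, Country varchar(2), State varchar(2),
    Year integer, Month integer, ActiveUserFlag integer
);
insert into MemberActivity values (1, 'US', 'CA', 2019, 1, 1);
insert into MemberActivity values (1, 'US', 'CA', 2019, 2, 1);
insert into MemberActivity values (2, 'US', 'CA', 2019, 2, 1);
insert into MemberActivity values (3, 'US', 'CA', 2019, 3, 0); -- inactive row
insert into MemberActivity values (3, 'US', 'CA', 2019, 4, 1);

with first_purchase as (
    -- one row per member/year/country/state: the first active month
    select MBR_ID, Country, State, Year, min(Month) as first_month
    from MemberActivity
    where ActiveUserFlag > 0
      and Month <= 5
    group by MBR_ID, Country, State, Year
),
months as (
    -- every month present in the table, active or not
    select distinct Month from MemberActivity where Month <= 5
),
year_regions as (
    select distinct Year, Country, State from first_purchase
)
select r.Year, r.Country, r.State, m.Month,
       -- count() skips the NULLs produced by months with no new members
       sum(count(fp.MBR_ID))
           over (partition by r.Year, r.Country, r.State
                 order by m.Month
                 rows unbounded preceding) as YTD_Active_Member_Count
from year_regions r
cross join months m
left join first_purchase fp
       on fp.Year = r.Year
      and fp.Country = r.Country
      and fp.State = r.State
      and fp.first_month = m.Month
group by r.Year, r.Country, r.State, m.Month
order by r.Year, r.Country, r.State, m.Month;
-- With the rows above: month 1 -> 1, month 2 -> 2, month 3 -> 2, month 4 -> 3.
```

Because min(Month) is grouped by Country and State as well as member, a member who buys in two states is counted once in each, which matches a distinct count partitioned by state.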
