SQL Server Count Distinct records with a specific condition in window functions

I have a table similar to the one below:

Group TradeMonth flag
A Jan 1
A Mar 1
A Mar 0
A Jun 0
B Feb 1
B Apr 1
B Sep 1
B Sep 1

I need a column that calculates the number of distinct months with non-zero values (flag = 1) for each Group. I would prefer to use a window function (not GROUP BY). I know COUNT(DISTINCT ...) is not allowed in window functions in SQL Server, so any solution that calculates this with a window function is highly appreciated. The results should look like this:

Group TradeMonth flag flagged_months_distinct
A Jan 1 2
A Mar 1 2
A Mar 0 2
A Jun 0 2
B Feb 1 3
B Apr 1 3
B Sep 1 3
B Sep 1 3

Unfortunately, you can't use COUNT(DISTINCT ...) OVER () in SQL Server, but here is one workaround:


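with
cte as
(
    -- number the distinct months within each ([Group], flag) partition
    select  *,
            dr = dense_rank() over (partition by [Group], flag order by [TradeMonth])
    from    yourtable
)
-- the highest rank among the flag = 1 rows is the distinct count for the group
select  [Group], [TradeMonth], flag,
        flagged_months = max(case when flag = 1 then dr end) over (partition by [Group])
from    cte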

dbfiddle demo
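
To see why taking the maximum works, it may help to look at the intermediate dr column on its own. The following is only a sketch: the table variable @t is a hypothetical stand-in for yourtable, loaded with the sample rows from the question.

```sql
-- Hypothetical sample data mirroring the question's table.
declare @t table ([Group] varchar(1), TradeMonth varchar(10), flag int);

insert into @t values
    ('A', 'Jan', 1), ('A', 'Mar', 1), ('A', 'Mar', 0), ('A', 'Jun', 0),
    ('B', 'Feb', 1), ('B', 'Apr', 1), ('B', 'Sep', 1), ('B', 'Sep', 1);

-- dense_rank() gives equal months the same rank and leaves no gaps,
-- so the largest rank in a partition equals its number of distinct months.
select  [Group], TradeMonth, flag,
        dr = dense_rank() over (partition by [Group], flag order by TradeMonth)
from    @t
order by [Group], flag desc, dr;
```

TradeMonth is a string here, so the months are ranked alphabetically rather than by calendar order. That does not matter for this purpose, because only the number of distinct values is used. Among the flag = 1 rows, the largest dr is 2 for group A and 3 for group B, which matches the expected result.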

Try this: count the distinct flagged months per group in a CTE, then join the counts back to the table.

create table #test([Group] varchar(1), TradeMon varchar(10), Flag int);

insert into #test values
    ('A', 'Jan', 1), ('A', 'Mar', 1), ('A', 'Mar', 0), ('A', 'Jun', 0),
    ('B', 'Feb', 1), ('B', 'Apr', 1), ('B', 'Sep', 1), ('B', 'Sep', 1);

-- the statement before a CTE must end with a semicolon
WITH distinctCount AS
(
    -- one row per group: number of distinct months with Flag = 1
    SELECT [Group], COUNT(1) AS DistinctCount
    FROM
    (
        SELECT DISTINCT [Group], TradeMon
        FROM #test
        WHERE Flag = 1
    ) T
    GROUP BY [Group]
)
-- note: the INNER JOIN drops groups that have no Flag = 1 rows
SELECT T.[Group], T.TradeMon, T.Flag, DC.DistinctCount
FROM #test T
INNER JOIN distinctCount DC ON T.[Group] = DC.[Group];

You can actually do this with a single expression:

select t.*,
       (dense_rank() over (partition by [Group]
                           order by (case when flag = 1 then trademonth end)
                          ) +
        dense_rank() over (partition by [Group]
                           order by (case when flag = 1 then trademonth end) desc
                          ) -
        -- subtract 1, plus 1 more when the group has any flag = 0 rows (the NULL bucket)
        (2 - min(flag) over (partition by [Group]))
       ) as num_unique_flag_1
from t;

What is the logic here? Within a partition, the sum of dense_rank() with an ascending sort and dense_rank() with a descending sort is one more than the number of distinct values.

Under normal circumstances (i.e., calculating the number of distinct months), you would just subtract 1.

In this case, though, the rows with flag = 0 are treated as NULL. These still get counted as a value, but only as one value no matter how many such rows there are. So you subtract either 1 or 2, depending on whether the group contains any 0 values. Voila! The result is the number of distinct months with a flag of 1.
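
The arithmetic can be checked by showing each piece as its own column. This is a sketch under stated assumptions: @t is a hypothetical table variable holding the question's sample rows, flag is an int that only takes the values 0 and 1, and the column names rank_asc, rank_desc and has_zero are made up for illustration.

```sql
-- Hypothetical sample data mirroring the question's table.
declare @t table ([Group] varchar(1), TradeMonth varchar(10), flag int);

insert into @t values
    ('A', 'Jan', 1), ('A', 'Mar', 1), ('A', 'Mar', 0), ('A', 'Jun', 0),
    ('B', 'Feb', 1), ('B', 'Apr', 1), ('B', 'Sep', 1), ('B', 'Sep', 1);

select  [Group], TradeMonth, flag,
        -- rows with flag = 0 become NULL and share a single rank
        rank_asc  = dense_rank() over (partition by [Group]
                                       order by case when flag = 1 then TradeMonth end),
        rank_desc = dense_rank() over (partition by [Group]
                                       order by case when flag = 1 then TradeMonth end desc),
        -- 1 when the group contains at least one flag = 0 row, else 0
        has_zero  = 1 - min(flag) over (partition by [Group])
from    @t
order by [Group], rank_asc;
```

For every row, rank_asc + rank_desc - 1 - has_zero is the number of distinct flagged months. In group A the distinct sort values are NULL, Jan and Mar, so the two ranks sum to 4 on every row and has_zero is 1, giving 4 - 1 - 1 = 2. In group B the values are Apr, Feb and Sep, the ranks again sum to 4 and has_zero is 0, giving 4 - 1 - 0 = 3.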
