SQL Server Count Distinct records with a specific condition in window functions
I have a table similar to the one below:
| Group | TradeMonth | flag |
|---|---|---|
| A | Jan | 1 |
| A | Mar | 1 |
| A | Mar | 0 |
| A | Jun | 0 |
| B | Feb | 1 |
| B | Apr | 1 |
| B | Sep | 1 |
| B | Sep | 1 |
I need a column that calculates the number of distinct months with non-zero values (flag = 1) for each Group. I would prefer to use a window function (not GROUP BY). I know COUNT(DISTINCT ...) is not allowed in window functions in SQL Server, so any solution that calculates this with a window function is highly appreciated. The results should be as below:
| Group | TradeMonth | flag | # of flagged months (distinct) |
|---|---|---|---|
| A | Jan | 1 | 2 |
| A | Mar | 1 | 2 |
| A | Mar | 0 | 2 |
| A | Jun | 0 | 2 |
| B | Feb | 1 | 3 |
| B | Apr | 1 | 3 |
| B | Sep | 1 | 3 |
| B | Sep | 1 | 3 |
Unfortunately, you can't do `COUNT(DISTINCT ...) OVER ()`, but here is one workaround:
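with
cte as
(
    -- rank the distinct months separately for flag = 1 and flag = 0 within each group
    select  *,
            dr = dense_rank() over (partition by [Group], flag order by [TradeMonth])
    from    yourtable
)
-- the highest rank among the flag = 1 rows is the number of distinct flagged months
select  [Group], [TradeMonth], flag,
        max(case when flag = 1 then dr end) over (partition by [Group]) as flagged_months
from    cte;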
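To see why taking the maximum works, it can help to look at the intermediate `dense_rank()` values on the sample data. The following is a minimal sketch, assuming a hypothetical temp table `#rank_demo` loaded with the rows from the question. The ranks follow the alphabetical order of the month names, which does not matter here because only the number of distinct values is used.

```sql
-- Hypothetical temp table holding the sample rows from the question.
CREATE TABLE #rank_demo ([Group] varchar(1), TradeMonth varchar(10), flag int);
INSERT INTO #rank_demo VALUES
    ('A', 'Jan', 1), ('A', 'Mar', 1), ('A', 'Mar', 0), ('A', 'Jun', 0),
    ('B', 'Feb', 1), ('B', 'Apr', 1), ('B', 'Sep', 1), ('B', 'Sep', 1);

-- Rank the months within each (Group, flag) pair; duplicate months share a rank.
SELECT [Group], TradeMonth, flag,
       DENSE_RANK() OVER (PARTITION BY [Group], flag ORDER BY TradeMonth) AS dr
FROM #rank_demo
ORDER BY [Group], flag DESC, TradeMonth;

-- Group  TradeMonth  flag  dr
-- A      Jan         1     1
-- A      Mar         1     2
-- A      Jun         0     1
-- A      Mar         0     2
-- B      Apr         1     1
-- B      Feb         1     2
-- B      Sep         1     3
-- B      Sep         1     3

DROP TABLE #rank_demo;
```

Within the flag = 1 rows, the largest `dr` is 2 for group A and 3 for group B, which are the counts the question asks for.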
Try this:
create table #test([Group] varchar(1), TradeMon Varchar(10), Flag int);
insert into #test values ('A', 'Jan', 1),('A', 'Mar', 1),('A', 'Mar', 0),('A', 'Jun', 0),('B', 'Feb', 1),('B', 'Apr', 1),('B', 'Sep', 1),('B', 'Sep', 1);
-- the statement before a CTE must end with a semicolon
With distinctCount AS
(
SELECT [group], COUNT(1) AS DistinctCount
FROM
(
-- one row per group and flagged month
select distinct [group], TradeMon
from #test
where flag=1
)T GROUP BY [group]
)
SELECT T.[GROUP], T.TradeMon, T.Flag, DC.DistinctCount
FROM #test T
INNER JOIN distinctCount DC ON (T.[GROUP] = DC.[Group]);
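One thing to watch with the join approach: an inner join drops every row of a group that has no flag = 1 rows at all. The sample data has no such group, so this is only a possible variant and not part of the original answer. It is a sketch assuming a hypothetical temp table `#test_lj` with an extra group C that is never flagged, and it uses a left join plus `COALESCE` to report 0 for that group.

```sql
-- Hypothetical data: group C has no flagged months.
CREATE TABLE #test_lj ([Group] varchar(1), TradeMon varchar(10), Flag int);
INSERT INTO #test_lj VALUES
    ('A', 'Jan', 1), ('A', 'Mar', 1), ('A', 'Mar', 0), ('C', 'Jun', 0);

WITH distinctCount AS
(
    -- COUNT(DISTINCT ...) is allowed in an ordinary aggregate, just not with OVER
    SELECT [Group], COUNT(DISTINCT TradeMon) AS DistinctCount
    FROM #test_lj
    WHERE Flag = 1
    GROUP BY [Group]
)
SELECT T.[Group], T.TradeMon, T.Flag,
       COALESCE(DC.DistinctCount, 0) AS DistinctCount  -- 0 when the group has no match
FROM #test_lj AS T
LEFT JOIN distinctCount AS DC ON T.[Group] = DC.[Group];

DROP TABLE #test_lj;
```

With this data, the rows of group A show 2 and the single row of group C shows 0 instead of disappearing from the result.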
You can actually do this with a single expression:
select t.*,
       (dense_rank() over (partition by [Group]
                           order by (case when flag = 1 then TradeMonth end)
                          ) +
        dense_rank() over (partition by [Group]
                           order by (case when flag = 1 then TradeMonth end) desc
                          ) -
        -- subtract 2 if the group contains a 0 flag (the extra NULL value), otherwise 1;
        -- assumes flag is an integer that is always 0 or 1
        (2 - min(flag) over (partition by [Group]))
       ) as num_unique_flag_1
from t;
What is the logic here? The sum of `dense_rank()` with an ascending sort and `dense_rank()` with a descending sort is one more than the number of distinct values.

Under normal circumstances (i.e. calculating the number of distinct months), you would just subtract 1.

In this case, though, you treat 0 values as `NULL`. These still get counted, but only as a single extra distinct value. So you subtract either 1 or 2, depending on the presence of `0` values in the group. Voilà! The result is the number of distinct months with a flag of 1.
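The arithmetic can be checked on the sample data by showing the two ranks side by side. This is a minimal sketch, assuming a hypothetical temp table `#sum_demo` with the question's rows and a `flag` column that is always 0 or 1. It relies on SQL Server sorting `NULL` as the lowest value, so the `NULL` produced for flag = 0 rows takes rank 1 in the ascending sort.

```sql
-- Hypothetical temp table holding the sample rows from the question.
CREATE TABLE #sum_demo ([Group] varchar(1), TradeMonth varchar(10), flag int);
INSERT INTO #sum_demo VALUES
    ('A', 'Jan', 1), ('A', 'Mar', 1), ('A', 'Mar', 0), ('A', 'Jun', 0),
    ('B', 'Feb', 1), ('B', 'Apr', 1), ('B', 'Sep', 1), ('B', 'Sep', 1);

WITH ranks AS
(
    SELECT [Group], TradeMonth, flag,
           DENSE_RANK() OVER (PARTITION BY [Group]
                              ORDER BY CASE WHEN flag = 1 THEN TradeMonth END) AS rank_asc,
           DENSE_RANK() OVER (PARTITION BY [Group]
                              ORDER BY CASE WHEN flag = 1 THEN TradeMonth END DESC) AS rank_desc,
           MIN(flag) OVER (PARTITION BY [Group]) AS min_flag  -- 0 if the group has any 0 flag
    FROM #sum_demo
)
SELECT [Group], TradeMonth, flag, rank_asc, rank_desc,
       rank_asc + rank_desc - (2 - min_flag) AS num_unique_flag_1
FROM ranks
ORDER BY [Group], TradeMonth, flag;

-- Group  TradeMonth  flag  rank_asc  rank_desc  num_unique_flag_1
-- A      Jan         1     2         2          2
-- A      Jun         0     1         3          2
-- A      Mar         0     1         3          2
-- A      Mar         1     3         1          2
-- B      Apr         1     1         3          3
-- B      Feb         1     2         2          3
-- B      Sep         1     3         1          3
-- B      Sep         1     3         1          3

DROP TABLE #sum_demo;
```

In every row the two ranks add up to 4. Group A has three distinct values (Jan, Mar, and the `NULL` standing in for the 0 flags), so 2 is subtracted; group B has three distinct months and no `NULL`, so only 1 is subtracted.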