Count distinct in SAS SQL case when
I have a dataset:
Outlet Period Brand Sales
A Jan XX 12
A Jan XY 13
A FEB AB 10
B JAN AC 19
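I want to count the number of distinct brands for each outlet in each period, excluding brand 'CD'. Why does the count not work when written as a single expression, as in Example 1 below, and only work when written as in Example 2?
Example 1 (brand CD is counted even though it should not be)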
PROC SQL;
CREATE TABLE Test AS
SELECT
Outlet, Period, Brand,
case when Brand not in ('CD') then count (distinct Brand) end as k_Brands_Players2
FROM YOUR_DATASET_NAME
group by period, outlet;
quit;
Example 2 (brand CD is correctly not counted)
PROC SQL;
CREATE TABLE Test AS
SELECT
Outlet, Period, Brand,
case when Brand not in ('CD') then Brand else ' ' end as Brand_Players,
count(distinct calculated Brand_Players) as k_Brands_Players
FROM YOUR_DATASET_NAME
group by period, outlet;
quit;
The expected output is:
Outlet Period Brand k_Brands_Players
A Jan XX 2
A Jan XY 2
A Feb AS 3
A FEB QW 3
A Feb XY 3
B Jan KW 1
....
The problem in the first query is that you are using the COUNT() aggregate function in the wrong place. You have
case when Brand not in ('CD') then count(distinct Brand) end
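So when BRAND equals 'CD' you get a missing value on that row; on every other row you get the number of distinct brands in the group, which still includes the 'CD' brand.
If you use this construct instead: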
count(distinct case when Brand not in ('CD') then Brand end)
then the CASE expression returns a missing value for 'CD', and the COUNT() function does not count missing values.
Try this:
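To compare the two placements side by side, here is a minimal sketch. It assumes a hypothetical table work.sales built from the sample rows in the question plus one invented 'CD' row (the question's sample has none). The table name and the column names k_case_outside and k_case_inside are illustrative only.

```sas
/* Hypothetical sample data: the question's rows plus one invented CD row */
data work.sales;
  input Outlet $ Period $ Brand $ Sales;
  datalines;
A Jan XX 12
A Jan XY 13
A Jan CD 7
A FEB AB 10
B JAN AC 19
;
run;

proc sql;
  create table work.compare as
  select Outlet, Period, Brand,
         /* CASE outside COUNT: the count itself still includes CD;   */
         /* the CASE only blanks out the result on the CD row         */
         case when Brand not in ('CD') then count(distinct Brand) end
           as k_case_outside,
         /* CASE inside COUNT: CD becomes missing before counting,    */
         /* so COUNT(DISTINCT ...) skips it                           */
         count(distinct case when Brand not in ('CD') then Brand end)
           as k_case_inside
  from work.sales
  group by Outlet, Period;
quit;
```

With this data, outlet A in Jan should show k_case_outside = 3 on the XX and XY rows and a missing value on the CD row, while k_case_inside is 2 on all three rows. Because Brand is selected but not grouped, PROC SQL remerges the group-level count back onto the detail rows and writes a note about it to the log.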
/* COUNT(DISTINCT ...) goes outside the CASE WHEN expression */
PROC SQL;
CREATE TABLE Test AS
SELECT distinct a.Outlet, a.Period, a.Brand, b.k_Brands_Players
from
YOUR_DATASET_NAME a
LEFT JOIN
(
SELECT
Outlet, Period,
count(distinct(case when Brand not in ('CD') then Brand end)) as k_Brands_Players
FROM YOUR_DATASET_NAME
group by 1,2
) b
on a.Outlet=b.Outlet and a.Period=b.Period;
quit;
Let me know if you have any questions.