Getting DISTINCT COUNT in one pass in SQL Server

Question

I have a table like the following:

Region    Country    Manufacturer    Brand    Period    Spend
R1        C1         M1              B1       2016      5
R1        C1         M1              B1       2017      10
R1        C1         M1              B1       2017      20
R1        C1         M1              B2       2016      15
R1        C1         M1              B3       2017      20
R1        C2         M1              B1       2017      5
R1        C2         M2              B4       2017      25
R1        C2         M2              B5       2017      30
R2        C3         M1              B1       2017      35
R2        C3         M2              B4       2017      40
R2        C3         M2              B5       2017      45
...

I wrote the query below to aggregate it:

SELECT [Region]
    ,[Country]
    ,[Manufacturer]
    ,[Brand]
    ,Period
    ,SUM([Spend]) AS [Spend]
FROM myTable
GROUP BY [Region]
    ,[Country]
    ,[Manufacturer]
    ,[Brand]
    ,[Period]
ORDER BY 1,2,3,4

The result looks like this:

Region    Country    Manufacturer    Brand    Period    Spend
R1        C1         M1              B1       2016      5
R1        C1         M1              B1       2017      30 -- this row is an aggregate from raw table above
R1        C1         M1              B2       2016      15
R1        C1         M1              B3       2017      20
R1        C2         M1              B1       2017      4  -- aggregated result
R1        C2         M2              B4       2017      25
R1        C2         M2              B5       2017      30
R2        C3         M2              B4       2017      40
R2        C3         M2              B5       2017      45

I want to add another column to the table above that shows the DISTINCT COUNT of Brand, grouped by Region, Country, Manufacturer and Period. The final table would then look like this:

Region    Country    Manufacturer    Brand    Period    Spend    UniqBrandCount
R1        C1         M1              B1       2016      5        2 -- two brands by R1, C1, M1 in 2016
R1        C1         M1              B1       2017      30       2 -- two brands (B1, B3) by R1, C1, M1 in 2017
R1        C1         M1              B2       2016      15       2 -- same as first row's result
R1        C1         M1              B3       2017      20       2
R1        C2         M1              B1       2017      4        1
R1        C2         M2              B4       2017      25       2
R1        C2         M2              B5       2017      30       2
R2        C3         M2              B4       2017      40       2
R2        C3         M2              B5       2017      45       2

I know how to get the final result in three steps.

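Run the following query (Query 1):
SELECT [Region], [Country], [Manufacturer], [Period], COUNT(DISTINCT [Brand]) AS [BrandCount] INTO Temp1 FROM myTable GROUP BY [Region], [Country], [Manufacturer], [Period]
Run this query (Query 2):
SELECT [Region], [Country], [Manufacturer], [Brand], YEAR([Period]) AS Period, SUM([Spend]) AS [Spend] INTO Temp2 FROM myTable GROUP BY [Region], [Country], [Manufacturer], [Brand], [Period]
Then LEFT JOIN Temp2 to Temp1 to bring in [BrandCount] from the latter, like this:
SELECT a.*, b.* FROM Temp2 AS a LEFT JOIN Temp1 AS b ON a.[Region] = b.[Region] AND a.[Country] = b.[Country] AND a.[Manufacturer] = b.[Manufacturer] AND a.[Period] = b.[Period]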

I'm sure there is a more efficient way to do this, right? Thanks in advance for your suggestions/answers!

Answer 1

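The tag on your question, window-functions, suggests you already have the right idea.

For the distinct count of Brand grouped by Region, Country, Manufacturer and Period, you can write: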

Select   Region 
        ,Country
        ,Manufacturer
        ,Brand
        ,Period
        ,Spend
        ,DENSE_RANK() Over (Partition By Region, Country, Manufacturer, Period Order By Brand asc) 
         + DENSE_RANK() Over (Partition By Region, Country, Manufacturer, Period Order By Brand desc) 
         -1 UniqBrandCount
From myTable T1
Order By 1,2,3,4

Answer 2

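Borrowing heavily from this question: https://dba.stackexchange.com/questions/89031/using-distinct-in-window-function-with-over

COUNT(DISTINCT ...) is not allowed with an OVER clause, so dense_rank is needed instead. Rank the brands in ascending and in descending order, add the two ranks, and subtract 1 to get the distinct count.

Your SUM can also be rewritten with PARTITION BY logic. That way you can use a different grouping level for each aggregate: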

SELECT 
[Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]
,dense_rank() OVER 
    (PARTITION BY 
     [Region] 
    ,[Country]
    ,[Manufacturer]
    ,[Period] Order by Brand) 
+ dense_rank() OVER 
    (PARTITION BY 
     [Region] 
    ,[Country]
    ,[Manufacturer]
    ,[Period] Order by Brand Desc) 
- 1  
AS [BrandCount]
,SUM([Spend]) OVER
    (PARTITION BY
     [Region] 
    ,[Country]
    ,[Manufacturer]
    ,[Brand]
    ,[Period]) as [Spend]
from
myTable
ORDER BY 1,2,3,4

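You may then need to reduce the number of rows in the output, because this syntax returns the same number of rows as myTable, with the aggregated totals repeated on every row they apply to: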

R1  C1  M1  B1  2016    2   5
R1  C1  M1  B1  2017    2   30 --dup1
R1  C1  M1  B1  2017    2   30 --dup1
R1  C1  M1  B2  2016    2   15
R1  C1  M1  B3  2017    2   20
R1  C2  M1  B1  2017    1   5
R1  C2  M2  B4  2017    2   25
R1  C2  M2  B5  2017    2   30
R2  C3  M1  B1  2017    1   35
R2  C3  M2  B4  2017    2   40
R2  C3  M2  B5  2017    2   45

Selecting the distinct rows from this output gives you what you need.

How the dense_rank trick works

Consider the following data:

Col1    Col2
B       1
B       1
B       3
B       5
B       7
B       9

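dense_rank() ranks each row by the number of distinct items before the current item, plus 1:

1->1, 3->2, 5->3, 7->4, 9->5.

In reverse order (using desc) it produces the opposite pattern:

1->5, 3->4, 5->3, 7->2, 9->1.

Adding the two ranks together gives the same value for every row: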

1 + 5 = 2 + 4 = 3 + 3 = 4 + 2 = 5 + 1 = 6

Spelling it out in words helps here:

(number of distinct items before + 1) + (number of distinct items after + 1) 
= number of distinct OTHER items before AND after + 2 
= Total number of distinct items + 1

So, to get the total number of distinct items, add the ascending and descending dense_ranks together and subtract 1.
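
The arithmetic above can be checked outside SQL Server. The following is a minimal Python sketch, not part of the original answer: the helper `dense_rank` is a hypothetical stand-in that imitates DENSE_RANK() OVER (ORDER BY ...) on a single partition, applied to the Col2 values from the example.

```python
def dense_rank(values, descending=False):
    """Imitate DENSE_RANK() OVER (ORDER BY value [DESC]) for one partition."""
    # Each distinct value gets the next rank, with no gaps for ties.
    ordered = sorted(set(values), reverse=descending)
    rank_of = {value: position + 1 for position, value in enumerate(ordered)}
    return [rank_of[value] for value in values]


col2 = [1, 1, 3, 5, 7, 9]  # the Col2 values from the example above

asc = dense_rank(col2)
desc = dense_rank(col2, descending=True)

# Ascending rank + descending rank - 1 is the same on every row
# and equals the number of distinct values in the partition.
distinct_counts = [a + d - 1 for a, d in zip(asc, desc)]

print(asc)              # [1, 1, 2, 3, 4, 5]
print(desc)             # [5, 5, 4, 3, 2, 1]
print(distinct_counts)  # [5, 5, 5, 5, 5, 5]
```

Note that the duplicate value 1 shares a rank in both directions, which is why dense_rank is used here: with a ranking that leaves gaps after ties, the two ranks would no longer add up to the distinct count plus 1.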

Answer 3

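The double dense_rank idea means you need two sorts (assuming no index exists that provides the sort order). Assuming there are no NULL brands (which that idea also assumes), you can use a single dense_rank and a windowed MAX, as below: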

WITH T1
     AS (SELECT *,
                DENSE_RANK() OVER (PARTITION BY [Region], [Country], [Manufacturer], [Period] ORDER BY Brand) AS [dr]
         FROM   myTable),
     T2
     AS (SELECT *,
                MAX([dr]) OVER (PARTITION BY [Region], [Country], [Manufacturer], [Period]) AS UniqBrandCount
         FROM   T1)
SELECT [Region],
       [Country],
       [Manufacturer],
       [Brand],
       Period,
       SUM([Spend])        AS [Spend],
       MAX(UniqBrandCount) AS UniqBrandCount
FROM   T2
GROUP  BY [Region],
          [Country],
          [Manufacturer],
          [Brand],
          [Period]
ORDER  BY [Region],
          [Country],
          [Manufacturer],
          [Period],
          Brand

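The above involves some unavoidable spooling (this cannot be done in a 100% streaming fashion), but only a single sort.

Oddly, the final ORDER BY clause is needed to keep the number of sorts down to one (or zero if a suitable index exists).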
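
To make the logic of the query easier to follow, here is a rough Python sketch of the same two steps, run on the eleven rows shown in the question. It is only an illustration of the idea (largest dense rank per partition, then SUM per brand), not a model of how SQL Server executes the plan, and all variable names are made up for this example.

```python
from collections import defaultdict

# Rows shown in the question:
# (Region, Country, Manufacturer, Brand, Period, Spend)
rows = [
    ("R1", "C1", "M1", "B1", 2016, 5),
    ("R1", "C1", "M1", "B1", 2017, 10),
    ("R1", "C1", "M1", "B1", 2017, 20),
    ("R1", "C1", "M1", "B2", 2016, 15),
    ("R1", "C1", "M1", "B3", 2017, 20),
    ("R1", "C2", "M1", "B1", 2017, 5),
    ("R1", "C2", "M2", "B4", 2017, 25),
    ("R1", "C2", "M2", "B5", 2017, 30),
    ("R2", "C3", "M1", "B1", 2017, 35),
    ("R2", "C3", "M2", "B4", 2017, 40),
    ("R2", "C3", "M2", "B5", 2017, 45),
]

# T1/T2: within each (Region, Country, Manufacturer, Period) partition,
# dense-rank the brands and keep the largest rank. With no NULL brands,
# that maximum is the number of distinct brands.
brands_by_partition = defaultdict(set)
for region, country, manufacturer, brand, period, _ in rows:
    brands_by_partition[(region, country, manufacturer, period)].add(brand)

uniq_brand_count = {}
for partition, brands in brands_by_partition.items():
    dense_ranks = {b: i + 1 for i, b in enumerate(sorted(brands))}
    uniq_brand_count[partition] = max(dense_ranks.values())

# Final SELECT: SUM(Spend) grouped down to brand level.
spend = defaultdict(int)
for region, country, manufacturer, brand, period, amount in rows:
    spend[(region, country, manufacturer, brand, period)] += amount

# Same ordering as the query: Region, Country, Manufacturer, Period, Brand.
result = sorted(
    ((r, c, m, b, p, total, uniq_brand_count[(r, c, m, p)])
     for (r, c, m, b, p), total in spend.items()),
    key=lambda row: (row[0], row[1], row[2], row[4], row[3]),
)

for row in result:
    print(*row)
```

The two B1 rows for R1, C1, M1 in 2017 collapse into one row with Spend 30, and that row carries a brand count of 2 because B1 and B3 both appear in that partition.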
