Running Distinct Count with a Partition

I'd like a running distinct count, partitioned by year, for the following data:

DROP TABLE IF EXISTS #FACT;
CREATE TABLE #FACT("Year" INT,"Month" INT, "Acc" varchar(5));
INSERT INTO #FACT
    values 
        (2015, 1, 'A'),
        (2015, 1, 'B'),
        (2015, 1, 'B'),
        (2015, 1, 'C'),
        (2015, 2, 'D'),
        (2015, 2, 'E'),
        (2015, 3, 'E'),
        (2016, 1, 'A'),
        (2016, 1, 'A'),
        (2016, 2, 'B'),
        (2016, 2, 'C');
SELECT * FROM #FACT;    

The following returns the correct answer, but is there a more concise way that also performs well?

WITH 
dnsRnk AS
(
    SELECT 
        "Year"
        , "Month"
        , DenseR  = DENSE_RANK() OVER(PARTITION BY "Year", "Month" ORDER BY "Acc")
    FROM #FACT
),
mxPerMth AS
(
    SELECT
        "Year"
        , "Month"
        , RunningTotal = MAX(DenseR)
    FROM dnsRnk
    GROUP BY 
        "Year"
        , "Month"
)
SELECT 
    "Year"
    , "Month"
    , X = SUM(RunningTotal) OVER (PARTITION BY "Year" ORDER BY "Month")
FROM mxPerMth
ORDER BY 
    "Year"
    , "Month";

The above returns the following; an answer should return exactly the same table:

[image: result table]

If you want a running count of distinct accounts:

SELECT f.*,
    sum(case when seqnum = 1 then 1 else 0 end) over (partition by year order by month) as cume_distinct_acc
FROM (
    SELECT 
        f.*
        ,row_number() over (partition by acc order by year, month) as seqnum
    FROM #fact f
) f;

This counts each account in the first month in which it appears.

EDIT:

Oops. The above doesn't aggregate by year and month, and it doesn't start over for each year. Here is the correct solution:

SELECT 
    year
    ,month
    ,sum( sum(case when seqnum = 1 then 1 else 0 end)
        ) over (partition by year order by month) as cume_distinct_acc
FROM (
    SELECT 
        f.*
        ,row_number() over (partition by acc, year order by month) as seqnum
    FROM #fact f
) f
group by year, month
order by year, month;

SQL Fiddle isn't working, but the following is a self-contained example:

with FACT as (
    SELECT yyyy, mm, account
    FROM (values 
        (2015, 1, 'A'),
        (2015, 1, 'B'),
        (2015, 1, 'B'),
        (2015, 1, 'C'),
        (2015, 2, 'D'),
        (2015, 2, 'E'),
        (2015, 3, 'E'),
        (2016, 1, 'A'),
        (2016, 1, 'A'),
        (2016, 2, 'B'),
        (2016, 2, 'C')) v(yyyy, mm, account)
)
SELECT 
    yyyy
    ,mm
    ,sum(sum(case when seqnum = 1 then 1 else 0 end)) over (partition by yyyy order by mm) as cume_distinct_acc
FROM (
    SELECT 
        f.*
        ,row_number() over (partition by account, yyyy order by mm) as seqnum
    FROM fact f
) f
group by yyyy, mm
order by yyyy, mm;
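
As a cross-check, here is a minimal sketch that runs both interpretations of "running distinct count" on the sample data. It is not part of the original answers: it assumes Python's built-in `sqlite3` module with SQLite 3.25 or later (for window functions) instead of SQL Server, and the names `run`, `PER_MONTH_SQL` and `FIRST_SEEN_SQL` are made up for illustration. On this data the two differ in one row: account E appears in both month 2 and month 3 of 2015, so summing per-month distinct counts (as the query in the question does) gives 6 for 2015-03, while counting each account only in its first month of the year gives 5.

```python
import sqlite3

# Sample rows from the question: (year, month, account).
ROWS = [
    (2015, 1, "A"), (2015, 1, "B"), (2015, 1, "B"), (2015, 1, "C"),
    (2015, 2, "D"), (2015, 2, "E"), (2015, 3, "E"),
    (2016, 1, "A"), (2016, 1, "A"), (2016, 2, "B"), (2016, 2, "C"),
]

# Interpretation 1: distinct accounts per month, then a running sum per year.
# An account seen in two months of the same year is counted twice.
PER_MONTH_SQL = """
WITH per_month AS (
    SELECT yyyy, mm, COUNT(DISTINCT acc) AS cnt
    FROM fact
    GROUP BY yyyy, mm
)
SELECT yyyy, mm, SUM(cnt) OVER (PARTITION BY yyyy ORDER BY mm) AS x
FROM per_month
ORDER BY yyyy, mm
"""

# Interpretation 2: count each account only in the first month it appears
# within a year, then take a running sum per year.
FIRST_SEEN_SQL = """
WITH numbered AS (
    SELECT yyyy, mm,
           ROW_NUMBER() OVER (PARTITION BY acc, yyyy ORDER BY mm) AS seqnum
    FROM fact
),
per_month AS (
    SELECT yyyy, mm, SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) AS new_acc
    FROM numbered
    GROUP BY yyyy, mm
)
SELECT yyyy, mm, SUM(new_acc) OVER (PARTITION BY yyyy ORDER BY mm) AS x
FROM per_month
ORDER BY yyyy, mm
"""


def run(sql):
    """Load the sample rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE fact (yyyy INTEGER, mm INTEGER, acc TEXT)")
        conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for label, sql in (("per-month distinct", PER_MONTH_SQL),
                       ("first appearance", FIRST_SEEN_SQL)):
        print(label)
        for row in run(sql):
            print("  ", row)
```

Which interpretation is wanted depends on whether an account that is active in several months of a year should be counted once per year or once per month.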

Demo here:

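;with cte as (
    -- distinct accounts per year and month (column names as in the question's #FACT)
    SELECT "Year", "Month", count(distinct Acc) as cnt
    FROM #fact
    GROUP BY "Year", "Month"
)
SELECT 
    "Year"
    ,"Month"
    -- running sum of the monthly counts, restarting for each year
    ,sum(cnt) over (partition by "Year" order by "Month" rows unbounded preceding) as x
FROM cte
ORDER BY "Year", "Month";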
