Rolling Aggregation

I am trying to write a procedure in SQL Server based on a rolling date aggregation.

Take the following data:

Acc Dte     Amount
1   1/1/20    100
1   1/3/20    200
1   1/8/20    100
1   1/8/20     75
2   1/1/20     50
2   1/2/20    100
2   1/3/20     75
2   1/3/20    125
3   1/3/20    100
3   1/6/20     75
3   1/8/20     75
3   1/10/20   200
3   1/10/20   150

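The goal is to find, for each account, the count and average of the records and dates that come before the record being analyzed. I also need to sum the records by date. Based on the data above, the result would look like this: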

Acc  Dte    Num_of_dates Avg_Amount_per_day Current_Amount
1    1/3/20            1              100              200
1    1/8/20            2              150              175
2    1/2/20            1               50              100
2    1/3/20            2               75              200
3    1/6/20            1              100               75
3    1/8/20            2               83.3             75
3    1/10/20           3               83.3            350

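The end goal is to create a z-score that compares an account's amount for the current day with that account's average per day. We also need at least 10 days of history for each account.

Right now my code looks like this, and it does not work: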
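The z-score described above can be sketched as follows, in Python rather than T-SQL so the arithmetic can be checked outside the database. This is only a sketch under assumptions the question does not state: the z-score is taken as (current day's total minus the mean of the earlier daily totals) divided by the sample standard deviation of those earlier totals, and an account is scored only once it has `min_history` earlier days. The function name `daily_z_scores` and the parameter `min_history` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def daily_z_scores(rows, min_history=10):
    """Return {(acc, date): z} for days with enough prior history.

    rows: iterable of (acc, date, amount); dates must sort chronologically
    (for example datetime.date objects or ISO-formatted strings).
    """
    # Step 1: collapse multiple records on the same day into one daily total.
    totals = defaultdict(float)
    for acc, dte, amount in rows:
        totals[(acc, dte)] += amount

    # Step 2: list each account's daily totals in date order.
    by_acc = defaultdict(list)
    for (acc, dte), total in sorted(totals.items()):
        by_acc[acc].append((dte, total))

    # Step 3: compare each day with the days strictly before it.
    scores = {}
    for acc, days in by_acc.items():
        for i, (dte, total) in enumerate(days):
            history = [t for _, t in days[:i]]
            # Assumption: a sample standard deviation needs at least 2 points.
            if len(history) < max(min_history, 2):
                continue
            sd = stdev(history)
            if sd == 0:
                continue  # z-score is undefined when the history never varies
            scores[(acc, dte)] = (total - mean(history)) / sd
    return scores
```

With the sample data, no account reaches 10 earlier days, so the default returns an empty result; lowering `min_history` shows the calculation on a short history.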

select Account, 
       Dte, 
       (select sum(case when Cast(EventTimestamp as DATE) < Dte then 1 else 0 end) Num_of_Date,
       (select (case when Cast(EventTimestamp as DATE) < Dte then sum(Amount) else 0 end) t_amount
from Data
group by Account, Dte

Any ideas? Thanks.

You can use window functions with the appropriate ROWS clause. This is one case where DISTINCT comes in handy:

select distinct
    acc,
    dte,
    count(*) over(
        partition by acc
        order by dte
        rows between unbounded preceding and 1 preceding
    ) num_of_dates,
    avg(1.0 * amount) over(
        partition by acc
        order by dte
        rows between unbounded preceding and 1 preceding
    ) avg_amount_per_day,
    sum(amount) over(partition by acc, dte) current_amount
from mytable
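To see what the ROWS frame does when an account has several records on the same date, the query can be tried against an in-memory SQLite database from Python. This is a sketch, not a SQL Server test: SQLite (3.25 or later, for window functions) is used as a stand-in, dates are stored as ISO strings so that they sort correctly, the average column is left out because its value on tied rows depends on the order in which the ties are processed, and the helper name `run_distinct_query` is hypothetical.

```python
import sqlite3

# Account 1 from the question's sample data.
SAMPLE = [
    (1, "2020-01-01", 100),
    (1, "2020-01-03", 200),
    (1, "2020-01-08", 100),
    (1, "2020-01-08", 75),
]

QUERY = """
select distinct
    acc,
    dte,
    count(*) over (
        partition by acc
        order by dte
        rows between unbounded preceding and 1 preceding
    ) as num_of_dates,
    sum(amount) over (partition by acc, dte) as current_amount
from mytable
order by acc, dte, num_of_dates
"""


def run_distinct_query(rows=SAMPLE):
    """Load the rows into a throwaway table and return the query result."""
    con = sqlite3.connect(":memory:")
    con.execute("create table mytable (acc integer, dte text, amount integer)")
    con.executemany("insert into mytable values (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in run_distinct_query():
        print(row)
```

Because a ROWS frame counts physical rows, the two records dated 2020-01-08 see 2 and 3 preceding rows respectively, so DISTINCT keeps both of them. The first date of the account also appears, with a count of 0.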

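If you really want only one record per date and account, as shown in your sample data, you can nest the query and use row_number(). In the absence of an obvious column to define the sort order, I rely on the cumulative count: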

select acc, dte, num_of_dates, avg_amount_per_day, current_amount
from (
    select 
        t.*, 
        row_number() over(partition by acc, dte order by num_of_dates) rn
    from (
        select
            acc,
            dte,
            count(*) over(
                partition by acc 
                order by dte
                rows between unbounded preceding and 1 preceding
            ) num_of_dates,
            avg(1.0 * amount) over(
                partition by acc 
                order by dte
                rows between unbounded preceding and 1 preceding
            ) avg_amount_per_day,
            sum(amount) over(partition by acc, dte) current_amount
        from mytable
    ) t
) t
where rn = 1 and avg_amount_per_day is not null

Demo on DB Fiddle:

acc | dte        | num_of_dates | avg_amount_per_day | current_amount
--: | :--------- | -----------: | :----------------- | -------------:
  1 | 2020-01-03 |            1 | 100.000000         |            200
  1 | 2020-01-08 |            2 | 150.000000         |            175
  2 | 2020-01-02 |            1 | 50.000000          |            100
  2 | 2020-01-03 |            2 | 75.000000          |            200
  3 | 2020-01-06 |            1 | 100.000000         |             75
  3 | 2020-01-08 |            2 | 87.500000          |             75
  3 | 2020-01-10 |            3 | 83.333333          |            350

Your sample data and description suggest:

select acc, dte,
       count(*) as num_on_day,
       sum(amount) as sum_on_day,
       -- group by leaves one row per (acc, dte), so a ROWS frame ordered by dte
       -- covers exactly the earlier days; 1.0 * avoids an integer average
       avg(sum(1.0 * amount)) over (partition by acc order by dte rows between unbounded preceding and 1 preceding) as avg_previous
from t
group by acc, dte;

I am not sure why you do not include the first date for each acc.
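The aggregate-then-window idea can be checked against the full sample data with a small Python script. This is a sketch under stated assumptions: it runs on an in-memory SQLite database (3.25 or later) rather than SQL Server, it aggregates per day in a CTE named `daily` and then applies a ROWS frame ordered by date, and both the CTE name and the helper `previous_daily_average` are hypothetical.

```python
import sqlite3

# Sample data from the question, with dates rewritten as ISO strings.
SAMPLE = [
    (1, "2020-01-01", 100), (1, "2020-01-03", 200),
    (1, "2020-01-08", 100), (1, "2020-01-08", 75),
    (2, "2020-01-01", 50), (2, "2020-01-02", 100),
    (2, "2020-01-03", 75), (2, "2020-01-03", 125),
    (3, "2020-01-03", 100), (3, "2020-01-06", 75), (3, "2020-01-08", 75),
    (3, "2020-01-10", 200), (3, "2020-01-10", 150),
]

QUERY = """
with daily as (
    -- one row per account and date
    select acc, dte, count(*) as num_on_day, sum(amount) as sum_on_day
    from t
    group by acc, dte
)
select acc, dte, num_on_day, sum_on_day,
       avg(1.0 * sum_on_day) over (
           partition by acc
           order by dte
           rows between unbounded preceding and 1 preceding
       ) as avg_previous
from daily
order by acc, dte
"""


def previous_daily_average(rows=SAMPLE):
    """Return {(acc, dte): (num_on_day, sum_on_day, avg_previous)}."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (acc integer, dte text, amount integer)")
    con.executemany("insert into t values (?, ?, ?)", rows)
    result = {
        (acc, dte): (n, total, avg_prev)
        for acc, dte, n, total, avg_prev in con.execute(QUERY)
    }
    con.close()
    return result


if __name__ == "__main__":
    for key, value in previous_daily_average().items():
        print(key, value)
```

The first date of each account is kept, with avg_previous set to NULL (None in Python) because no earlier day exists; filtering on avg_previous is not null would drop those rows, as in the question's expected output.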

