![](/img/trans.png)
[英]VERTICA: Need to calculate rolling DISTINCT COUNTS for past 3 months
[英]Oracle: Need to calculate rolling average for past 3 months where we have more than one submission per month
我已經在甲骨文中看到了許多滾動平均值的例子,但是做了我想要的事情。
這是我的原始數據
DATE SCORE AREA
----------------------------
01-JUL-14 60 A
01-AUG-14 45 A
01-SEP-14 45 A
02-SEP-14 50 A
01-OCT-14 30 A
02-OCT-14 45 A
03-OCT-14 50 A
01-JUL-14 60 B
01-AUG-14 45 B
01-SEP-14 45 B
02-SEP-14 50 B
01-OCT-14 30 B
02-OCT-14 45 B
03-OCT-14 50 B
這是我的滾動平均值的理想結果
MMYY AVG AREA
-------------------------
JUL-14 60 A
AUG-14 52.5 A
SEP-14 50 A
OCT-14 44 A
JUL-14 60 B
AUG-14 52.5 B
SEP-14 50 B
OCT-14 44 B
我需要它的工作方式是,對於每個MMYY,我需要回顧3個月,然后AVG得分。 所以,例如,
對於OCT的A區,在距離10月的最后3個月中,有6項研究,(45 + 45 + 50 + 30 + 45 + 50)/ 6 = 44.1
通常我會像這樣編寫查詢
SELECT
AREA,
TO_CHAR(T.DT,'MMYY') MMYY,
ROUND(AVG(SCORE)
OVER (PARTITION BY AREA ORDER BY TO_CHAR(T.DT,'MMYY') ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),1)
AS AVG
FROM T
這將查看過去3個月內的最后3個回合
一種方法是將聚合函數與分析函數混合使用。 平均值的關鍵思想是避免使用avg()
,而是將sum()
除以count(*)
。
SELECT AREA, TO_CHAR(T.DT, 'MMYY') AS MMYY,
SUM(SCORE) / COUNT(*) as AvgScore,
SUM(SUM(SCORE)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) / SUM(COUNT(*)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM t
GROUP BY AREA, TO_CHAR(T.DT, 'MMYY') ;
請注意order by
子句。 如果您的數據跨越多年,則使用MMYY格式會產生問題。 最好使用YYYY-MM等格式數月,因為字母順序與自然順序相同。
您也可以指定范圍,而不僅僅是行。
SELECT
AREA,
TO_CHAR(T.DT,'MMYY') MMYY,
ROUND(AVG(SCORE)
OVER (PARTITION BY AREA
ORDER BY DT RANGE BETWEEN INTERVAL '3' MONTH PRECEDING AND CURRENT ROW))
AS AVG
FROM T
由於CURRENT ROW
是默認值,因此ORDER BY DT RANGE INTERVAL '3' MONTH PRECEDING
應該有效。 也許你必須做一些微調,我沒有測試關於每月28/29/30/31天問題的行為。
有關更多詳細信息,請查看Oracle Windowing子句 。
SQL> WITH DATA AS(
2 SELECT to_date('01-JUL-14','DD-MON-RR') dt, 60 score, 'A' area FROM dual UNION ALL
3 SELECT to_date('01-AUG-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
4 SELECT to_date('01-SEP-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
5 SELECT to_date('02-SEP-14','DD-MON-RR') dt, 50 score, 'A' area FROM dual UNION ALL
6 SELECT to_date('01-OCT-14','DD-MON-RR') dt, 30 score, 'A' area FROM dual UNION ALL
7 SELECT to_date('02-OCT-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
8 SELECT to_date('03-OCT-14','DD-MON-RR') dt, 50 score, 'A' area FROM dual UNION ALL
9 SELECT to_date('01-JUL-14','DD-MON-RR') dt, 60 score, 'B' area FROM dual UNION ALL
10 SELECT to_date('01-AUG-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
11 SELECT to_date('01-SEP-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
12 SELECT to_date('02-SEP-14','DD-MON-RR') dt, 50 score, 'B' area FROM dual UNION ALL
13 SELECT to_date('01-OCT-14','DD-MON-RR') dt, 30 score, 'B' area FROM dual UNION ALL
14 SELECT to_date('02-OCT-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
15 SELECT to_date('03-OCT-14','DD-MON-RR') dt, 50 score, 'B' area FROM dual)
16 SELECT TO_CHAR(T.DT, 'MON-RR') AS MMYY,
17 round(
18 SUM(SUM(SCORE)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)/
19 SUM(COUNT(*)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),1)
20 AS avg_score,
21 AREA
22 FROM data t
23 GROUP BY AREA, TO_CHAR(T.DT, 'MON-RR')
24 /
MMYY AVG_SCORE A
------ ---------- -
JUL-14 60 A
AUG-14 52.5 A
SEP-14 50 A
OCT-14 44.2 A
JUL-14 60 B
AUG-14 52.5 B
SEP-14 50 B
OCT-14 44.2 B
8 rows selected.
SQL>
從下次開始,我希望您提供create
和insert
語句,以便我們不必花時間准備test case
。
而且,為什么YY
格式? 你沒見過Y2K
bug嗎? 請使用YYYY
格式。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.