Selecting values in a partition where a condition is met (Teradata SQL)
I have a dataset with the following columns and data:
Customer | Week_number | Amount
cust1 | 0 | 100
cust1 | 1 | 200
cust1 | 3 | 300
cust2 | 0 | 1000
cust2 | 1 | 2000
I need to calculate a two-week (fortnightly) total for each customer.
Using a window function, I was able to do this:
SELECT
CUSTOMER, WEEK_NUMBER
, SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS 1 PRECEDING) AS FORTNIGHT_AMOUNT
FROM AMOUNT
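However, this adds the previous row's amount even when there is no amount for the previous week. In the example above, for cust1, row 3, it adds week 3 and week 1 together. The previous amount should be added only when its week_number is exactly 1 less than the current row's week. Is this possible? Thank you for your help.
What I get: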
Customer | Week_number | Fortnight_Amount
cust1 | 0 | 100
cust1 | 1 | 300
cust1 | 3 | **500**
cust2 | 0 | 1000
cust2 | 1 | 3000
Expected result:
Customer | Week_number | Fortnight_Amount
cust1 | 0 | 100
cust1 | 1 | 300
cust1 | 3 | **300**
cust2 | 0 | 1000
cust2 | 1 | 3000
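If it is only ever two weeks/rows, your query can be simplified further so that Explain shows a single STATS step (because both OLAP functions apply the same PARTITION/ORDER):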
SELECT T.*
, CASE
WHEN MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) + 1 = WEEK_NUMBER
THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ELSE AMOUNT
END AS TWO_WEEK_SUM_AMOUNT
FROM MY_TABLE T
ORDER BY CUSTOMER, WEEK_NUMBER
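Of course, this assumes that weeks start at 0 and that there is no week 52/53 carried over from the previous year.
If you just want to ignore week numbers that are not immediately sequential, you can use lag() first, then do a window sum():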
select
customer,
week_number,
sum(
case when lag_week_number is null or week_number = lag_week_number + 1
then amount
else 0
end
) over(partition by customer order by week_number) fortnight_amount
from (
select
t.*,
lag(week_number) over(partition by customer order by week_number) lag_week_number
from mytable t
) t
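Actually, you might really want to reset the sum whenever there is a gap in the week numbers. That makes it a gaps-and-islands problem, and you would approach it differently: the idea is to use a cumulative sum that starts a new group whenever two consecutive week numbers are not in sequence, and then sum within each group: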
select
customer,
week_number,
sum(amount) over(partition by customer, grp order by week_number) fortnight_amount
from (
select
t.*,
sum(
case
when lag_week_number is null or week_number = lag_week_number + 1
then 0
else 1
end
) over(partition by customer order by week_number) grp
from (
select
t.*,
lag(week_number) over(partition by customer order by week_number) lag_week_number
from mytable t
) t
) t
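As a quick sanity check of the gaps-and-islands idea, here is a minimal sketch that runs the query against the question's sample rows in an in-memory SQLite database through Python's sqlite3 module. It is a stand-in under stated assumptions, not Teradata: the helper name island_running_sums is hypothetical, the group flag is summed with an explicit OVER clause ordered by week_number, the frame ROWS UNBOUNDED PRECEDING is spelled out so the running sums do not depend on an engine's default frame, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

# Sample rows from the question: (customer, week_number, amount).
SAMPLE_ROWS = [
    ("cust1", 0, 100),
    ("cust1", 1, 200),
    ("cust1", 3, 300),
    ("cust2", 0, 1000),
    ("cust2", 1, 2000),
]

# Gaps-and-islands: flag each row that breaks the week sequence, turn the
# flags into a group number with a running sum, then total within each group.
ISLAND_SUM_SQL = """
select customer, week_number,
       sum(amount) over(partition by customer, grp
                        order by week_number
                        rows unbounded preceding) as fortnight_amount
from (
    select t.*,
           sum(case when lag_week_number is null
                      or week_number = lag_week_number + 1
                    then 0 else 1 end)
               over(partition by customer
                    order by week_number
                    rows unbounded preceding) as grp
    from (
        select t.*,
               lag(week_number) over(partition by customer
                                     order by week_number) as lag_week_number
        from mytable t
    ) t
) t
order by customer, week_number
"""


def island_running_sums(rows=SAMPLE_ROWS):
    """Load rows into an in-memory table and return the per-island running sums."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table mytable (customer text, week_number integer, amount integer)"
        )
        conn.executemany("insert into mytable values (?, ?, ?)", rows)
        return conn.execute(ISLAND_SUM_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in island_running_sums():
        print(row)
```

On the sample data this reproduces the expected result (cust1, week 3 stays at 300). Keep in mind that it is a running total per island: three consecutive weeks would all accumulate into the third row, not just the last two.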
You want a RANGE window frame, not a ROWS frame:
SELECT CUSTOMER, WEEK_NUMBER,
SUM(AMOUNT) OVER (PARTITION BY CUSTOMER
ORDER BY WEEK_NUMBER
RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
) AS FORTNIGHT_AMOUNT
FROM AMOUNT;
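To see why the frame type matters, the following sketch computes both frames side by side on the question's sample rows. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in engine (an assumption; RANGE with a numeric offset needs SQLite 3.28 or later), and the helper name compare_frames and the output column names ROWS_SUM and RANGE_SUM are hypothetical.

```python
import sqlite3

# Sample rows from the question: (customer, week_number, amount).
SAMPLE_ROWS = [
    ("cust1", 0, 100),
    ("cust1", 1, 200),
    ("cust1", 3, 300),
    ("cust2", 0, 1000),
    ("cust2", 1, 2000),
]

# ROWS counts physical rows; RANGE compares WEEK_NUMBER values, so a
# missing week is not replaced by whatever row happens to come before it.
FRAME_COMPARISON_SQL = """
SELECT CUSTOMER, WEEK_NUMBER,
       SUM(AMOUNT) OVER (PARTITION BY CUSTOMER
                         ORDER BY WEEK_NUMBER
                         ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS ROWS_SUM,
       SUM(AMOUNT) OVER (PARTITION BY CUSTOMER
                         ORDER BY WEEK_NUMBER
                         RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS RANGE_SUM
FROM AMOUNT
ORDER BY CUSTOMER, WEEK_NUMBER
"""


def compare_frames(rows=SAMPLE_ROWS):
    """Return (customer, week_number, rows_sum, range_sum) for each input row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE AMOUNT (CUSTOMER TEXT, WEEK_NUMBER INTEGER, AMOUNT INTEGER)"
        )
        conn.executemany("INSERT INTO AMOUNT VALUES (?, ?, ?)", rows)
        return conn.execute(FRAME_COMPARISON_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in compare_frames():
        print(row)
```

The two columns differ only for cust1, week 3: the ROWS frame picks up week 1 (500), while the RANGE frame only covers weeks 2 to 3 (300).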
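Thanks @Gordon and @GMB for your answers. Unfortunately, I could not use either the LAG function or a RANGE frame in Teradata SQL. However, I was able to use the concepts you described to arrive at the following answer.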
SELECT
CUSTOMER
, WEEK_NUMBER
, LAG_WEEK_NUMBER
, AMOUNT
, CASE
WHEN WEEK_NUMBER = LAG_WEEK_NUMBER + 1
THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ELSE AMOUNT
END AS TWO_WEEK_SUM_AMOUNT
FROM (
SELECT
T.*
, MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER
FROM MY_TABLE T
) T
ORDER BY CUSTOMER, WEEK_NUMBER
I was able to get the LAG function implementation for Teradata from @dnoeth's answers in these links:
MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER
Please let me know if you find any problems with the answer or if it can be improved in any way.
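For anyone who wants to convince themselves that the MAX(...) ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING expression behaves like LAG, here is a small sketch that compares the two on the sample rows. It runs in an in-memory SQLite database through Python's sqlite3 module, which is only a stand-in for Teradata (an assumption; SQLite 3.25 or later is needed for window functions), and the helper name lag_comparison and the column names NATIVE_LAG and EMULATED_LAG are hypothetical.

```python
import sqlite3

# Sample rows from the question: (customer, week_number, amount).
SAMPLE_ROWS = [
    ("cust1", 0, 100),
    ("cust1", 1, 200),
    ("cust1", 3, 300),
    ("cust2", 0, 1000),
    ("cust2", 1, 2000),
]

# A frame of exactly one row (the previous one) makes MAX return that row's
# value; on the first row of a partition the frame is empty, so MAX is NULL.
LAG_EMULATION_SQL = """
SELECT CUSTOMER, WEEK_NUMBER,
       LAG(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER
                              ORDER BY WEEK_NUMBER) AS NATIVE_LAG,
       MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER
                              ORDER BY WEEK_NUMBER
                              ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS EMULATED_LAG
FROM MY_TABLE
ORDER BY CUSTOMER, WEEK_NUMBER
"""


def lag_comparison(rows=SAMPLE_ROWS):
    """Return (customer, week_number, native_lag, emulated_lag) for each row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE MY_TABLE (CUSTOMER TEXT, WEEK_NUMBER INTEGER, AMOUNT INTEGER)"
        )
        conn.executemany("INSERT INTO MY_TABLE VALUES (?, ?, ?)", rows)
        return conn.execute(LAG_EMULATION_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in lag_comparison():
        print(row)
```

Both columns agree on every row, including the NULL on the first row of each customer. That NULL is also why the CASE above falls through to ELSE AMOUNT for the first week.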