Selecting values in a partition where a condition is met (Teradata SQL)
Teradata SQL: Calculate running totals if a condition is met
I have a dataset with the following columns and data:
Customer | Week_number | Amount
cust1 | 0 | 100
cust1 | 1 | 200
cust1 | 3 | 300
cust2 | 0 | 1000
cust2 | 1 | 2000
I need to calculate a two-week (fortnightly) total for each customer.
Using a window function, I was able to get this far:
SELECT
CUSTOMER, WEEK_NUMBER
, SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS 1 PRECEDING) AS FORTNIGHT_AMOUNT
FROM AMOUNT
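But this adds the previous row's amount even when there is no amount for the previous week. In the example above, for cust1, row 3, it sums week 3 and week 1. The amount should be added only when the previous row's week_number is exactly 1 less than the current row's week. Is this possible? Thanks for your help.
What I get: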
Customer | Week_number | Fortnight_Amount
cust1 | 0 | 100
cust1 | 1 | 300
cust1 | 3 | **500**
cust2 | 0 | 1000
cust2 | 1 | 3000
Expected result:
Customer | Week_number | Fortnight_Amount
cust1 | 0 | 100
cust1 | 1 | 300
cust1 | 3 | **300**
cust2 | 0 | 1000
cust2 | 1 | 3000
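To make the difference between the two tables concrete, here is a small plain-Python sketch of both behaviours on the sample data. The function names `rows_frame` and `adjacent_week_only` are made up for illustration; this is not Teradata code, just the same arithmetic written out.

```python
from itertools import groupby

rows = [
    ("cust1", 0, 100),
    ("cust1", 1, 200),
    ("cust1", 3, 300),
    ("cust2", 0, 1000),
    ("cust2", 1, 2000),
]

def rows_frame(data):
    """Mimic ROWS 1 PRECEDING: always add the physically previous row."""
    out = []
    for customer, group in groupby(sorted(data), key=lambda r: r[0]):
        prev_amount = 0
        for _, week, amount in group:
            out.append((customer, week, amount + prev_amount))
            prev_amount = amount
    return out

def adjacent_week_only(data):
    """Add the previous row only when its week is exactly one less."""
    out = []
    for customer, group in groupby(sorted(data), key=lambda r: r[0]):
        prev_week, prev_amount = None, 0
        for _, week, amount in group:
            extra = prev_amount if prev_week == week - 1 else 0
            out.append((customer, week, amount + extra))
            prev_week, prev_amount = week, amount
    return out

got = rows_frame(rows)
wanted = adjacent_week_only(rows)
```

Here `got` matches the "What I get" table (500 for cust1, week 3) and `wanted` matches the expected one (300 for that row).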
If it is only two weeks/rows, your query can be simplified further so that Explain shows a single STATS step (because both OLAP functions use the same PARTITION/ORDER):
SELECT T.*
, CASE
WHEN MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) + 1 = WEEK_NUMBER
THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ELSE AMOUNT
END AS TWO_WEEK_SUM_AMOUNT
FROM MY_TABLE T
ORDER BY CUSTOMER, WEEK_NUMBER
Of course, this assumes that weeks start at 0 and that there is no week 52/53 of the previous year.
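A rough way to try the query above, and to see the week 52/53 caveat, is to run it on an in-memory SQLite database from Python. This is only a sketch: SQLite (3.25 or later, for window functions) stands in for Teradata, and `cust3` is a hypothetical customer added here whose week 52 is meant to be the week just before week 0.

```python
import sqlite3

# In-memory SQLite stands in for Teradata (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MY_TABLE (CUSTOMER TEXT, WEEK_NUMBER INTEGER, AMOUNT INTEGER)"
)
conn.executemany(
    "INSERT INTO MY_TABLE VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100), ("cust1", 1, 200), ("cust1", 3, 300),
        ("cust2", 0, 1000), ("cust2", 1, 2000),
        # Hypothetical rows: week 52 of the previous year, then week 0.
        ("cust3", 52, 5), ("cust3", 0, 7),
    ],
)

query = """
SELECT CUSTOMER, WEEK_NUMBER,
       CASE
         WHEN MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER
                                     ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) + 1
              = WEEK_NUMBER
         THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER
                                ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
         ELSE AMOUNT
       END AS TWO_WEEK_SUM_AMOUNT
FROM MY_TABLE
ORDER BY CUSTOMER, WEEK_NUMBER
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For cust1 and cust2 this gives the expected table. For the hypothetical cust3, weeks 0 and 52 are never treated as neighbours, so each row keeps its own amount, which is the limitation mentioned above.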
If you just want to ignore week numbers that are not immediately sequential, you can first use lag(), then do a window sum():
select
customer,
week_number,
sum(
case when lag_week_number is null or week_number = lag_week_number + 1
then amount
else 0
end
) over(partition by customer order by week_number) fortnight_amount
from (
select
t.*,
lag(week_number) over(partition by customer order by week_number) lag_week_number
from mytable t
) t
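As a quick check of the lag() approach, the sketch below runs the same query on the sample data using Python's sqlite3 module. SQLite (3.25+) is an assumption here, standing in for a database that supports lag(); an ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (customer TEXT, week_number INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100), ("cust1", 1, 200), ("cust1", 3, 300),
        ("cust2", 0, 1000), ("cust2", 1, 2000),
    ],
)

query = """
select
    customer,
    week_number,
    sum(
        case when lag_week_number is null or week_number = lag_week_number + 1
             then amount
             else 0
        end
    ) over(partition by customer order by week_number) fortnight_amount
from (
    select
        t.*,
        lag(week_number) over(partition by customer order by week_number) lag_week_number
    from mytable t
) t
order by customer, week_number
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that the outer sum() has no explicit frame, so it accumulates over the whole partition up to the current row. Here the week 3 row contributes 0 and shows 300 because that is the running total of weeks 0 and 1.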
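Actually, you might want to reset the sum when there is a gap in the week numbers. That is a kind of gaps-and-islands problem, which you would approach differently: the idea is to use a cumulative sum to start a new group whenever two consecutive week numbers are not sequential, then sum within each group: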
select
customer,
week_number,
sum(amount) over(partition by customer, grp order by week_number) fortnight_amount
from (
select
t.*,
sum(
case
when lag_week_number is null or week_number = lag_week_number + 1
then 0
else 1
end
) over(partition by customer order by week_number) grp
from (
select
t.*,
lag(week_number) over(partition by customer order by week_number) lag_week_number
from mytable t
) t
) t
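Here is a sketch of the gaps-and-islands idea run on SQLite from Python (an assumption, standing in for a database with lag()). In this sketch the inner sum() carries an explicit over(partition by customer order by week_number) clause so that grp is a running count of gaps, and the outer sum() orders by week_number, since the sample table has no week_date column. The cust1 row for week 4 is hypothetical, added to show the sum restarting after the gap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (customer TEXT, week_number INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100), ("cust1", 1, 200), ("cust1", 3, 300),
        ("cust1", 4, 400),  # hypothetical extra row, not in the question
        ("cust2", 0, 1000), ("cust2", 1, 2000),
    ],
)

query = """
select
    customer,
    week_number,
    grp,
    sum(amount) over(partition by customer, grp order by week_number) fortnight_amount
from (
    select
        t.*,
        -- running count of gaps: increases by 1 each time a week is skipped
        sum(
            case
                when lag_week_number is null or week_number = lag_week_number + 1
                then 0
                else 1
            end
        ) over(partition by customer order by week_number) grp
    from (
        select
            t.*,
            lag(week_number) over(partition by customer order by week_number) lag_week_number
        from mytable t
    ) t
) t
order by customer, week_number
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Weeks 0 and 1 of cust1 land in group 0, weeks 3 and 4 in group 1, and the running sum starts again from 300 at week 3.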
You want a range window frame, not a rows frame:
SELECT CUSTOMER, WEEK_NUMBER,
SUM(AMOUNT) OVER (PARTITION BY CUSTOMER
ORDER BY WEEK_NUMBER
RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
) AS FORTNIGHT_AMOUNT
FROM AMOUNT;
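To see what the RANGE frame does with the gap, the query can be tried on SQLite from Python. This is a sketch under the assumption that SQLite 3.28 or later is available (that is when RANGE with a numeric offset was added); it says nothing about which databases accept this syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AMOUNT (CUSTOMER TEXT, WEEK_NUMBER INTEGER, AMOUNT INTEGER)")
conn.executemany(
    "INSERT INTO AMOUNT VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100), ("cust1", 1, 200), ("cust1", 3, 300),
        ("cust2", 0, 1000), ("cust2", 1, 2000),
    ],
)

# RANGE compares WEEK_NUMBER values, so "1 PRECEDING" means
# "rows whose WEEK_NUMBER is at most 1 less", not "the previous row".
query = """
SELECT CUSTOMER, WEEK_NUMBER,
       SUM(AMOUNT) OVER (PARTITION BY CUSTOMER
                         ORDER BY WEEK_NUMBER
                         RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
                        ) AS FORTNIGHT_AMOUNT
FROM AMOUNT
ORDER BY CUSTOMER, WEEK_NUMBER
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For cust1, week 3, the frame covers week numbers 2 to 3, so only the 300 from week 3 is summed.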
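Thanks to @Gordon and @GMB for the answers. Unfortunately, I could not use either the LAG function or RANGE partitioning in Teradata SQL. But I was able to use the concepts you described to arrive at the following answer.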
SELECT
CUSTOMER
, WEEK_NUMBER
, LAG_WEEK_NUMBER
, AMOUNT
, CASE
WHEN WEEK_NUMBER = LAG_WEEK_NUMBER + 1
THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ELSE AMOUNT
END AS TWO_WEEK_SUM_AMOUNT
FROM (
SELECT
T.*
, MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER
FROM MY_TABLE T
) T
ORDER BY CUSTOMER, WEEK_NUMBER
I was able to get the LAG function implementation for Teradata from @dnoeth's answers in these links:
MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER
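To check that this MAX() over a one-row frame behaves like LAG, the sketch below computes both side by side on the sample data. It uses SQLite (3.25+) from Python purely as a test bed, which is an assumption; the column aliases MAX_BASED and LAG_BASED are made up for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MY_TABLE (CUSTOMER TEXT, WEEK_NUMBER INTEGER, AMOUNT INTEGER)"
)
conn.executemany(
    "INSERT INTO MY_TABLE VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100), ("cust1", 1, 200), ("cust1", 3, 300),
        ("cust2", 0, 1000), ("cust2", 1, 2000),
    ],
)

query = """
SELECT CUSTOMER, WEEK_NUMBER,
       MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER
                              ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS MAX_BASED,
       LAG(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER) AS LAG_BASED
FROM MY_TABLE
ORDER BY CUSTOMER, WEEK_NUMBER
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The frame contains exactly the previous row (or nothing for the first row of a partition, which yields NULL), so the two columns agree on every row.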
Please let me know if you find any issues with this answer or if it can be improved in any way.