
How to get the difference in an aggregated value between two time periods using a window function in BigQuery SQL?

date_key cust_id sales
2022-01-01 1 30
2022-01-02 1 35
2022-01-05 1 38
2022-01-10 1 20
2022-01-11 1 35
2022-01-01 2 20
2022-01-02 2 25
2022-01-04 2 38
2022-01-09 2 20
2022-01-15 1 35
2022-01-11 3 35

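I want to get all customer IDs that appear in the current period, and then join on the difference between sum(sales) for the period 2022-01-01 to 2022-01-05 and sum(sales) for the period 2022-01-06 to 2022-01-11.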

How would you achieve this with a window function? At the moment I am using CTEs:

with 
users as(
 select 
  distinct cust_id 
 from 
  tableSales 
  where date_key between date('2022-01-06) and date('2022-01-11)),
currentPeriod as(
 select
  distinct cust_id
  ,sum(sales) sales
 from users
  left join tableSales using (customer_id)
  where date_key between date('2022-01-06) and date('2022-01-11)
),
previousPeriod as(
 select
  distinct cust_id
  ,sum(sales) sales
 from users
 left join tableSales using (customer_id)
 where date_key between date('2022-01-05) and date('2022-01-01)
)
#-----------------------
Select 
 distinct cust_id 
 ,cp.sales - pp.sales deltaSales
 from users
left join currentperiod cp using(customer_id)
left join previousperiod pp using(customer_id)

There must be a shorter way to do this with a window function. Please help.

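Your query is missing closing quotes ('), and the fields customer_id and cust_id should be the same column, right?

The dates are swapped: between date('2022-01-05) and date('2022-01-01). BETWEEN expects the lower bound first, so a range written this way matches no rows.

The chosen time intervals are odd, because it is not clear why the user needs them.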
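With those points addressed, the asker's CTE approach might look like the sketch below. It closes the quotes, uses cust_id as the join key throughout, and puts the BETWEEN bounds in ascending order. Two things are assumptions rather than part of the original: the GROUP BY clauses (the original combined DISTINCT with SUM and had none) and the IFNULL that treats a customer with no previous-period sales as 0. The inline tableSales CTE only reproduces the sample rows from the question so the query is self-contained.

```sql
with tableSales as (
  -- Sample rows copied from the question
  select * from unnest([
    struct(date '2022-01-01' as date_key, 1 as cust_id, 30 as sales),
    (date '2022-01-02', 1, 35),
    (date '2022-01-05', 1, 38),
    (date '2022-01-10', 1, 20),
    (date '2022-01-11', 1, 35),
    (date '2022-01-01', 2, 20),
    (date '2022-01-02', 2, 25),
    (date '2022-01-04', 2, 38),
    (date '2022-01-09', 2, 20),
    (date '2022-01-15', 1, 35),
    (date '2022-01-11', 3, 35)
  ])
),
users as (
  -- Customers with at least one sale in the current period
  select distinct cust_id
  from tableSales
  where date_key between date '2022-01-06' and date '2022-01-11'
),
currentPeriod as (
  select cust_id, sum(sales) as sales
  from users
  left join tableSales using (cust_id)
  where date_key between date '2022-01-06' and date '2022-01-11'
  group by cust_id  -- assumption: the original had no GROUP BY
),
previousPeriod as (
  select cust_id, sum(sales) as sales
  from users
  left join tableSales using (cust_id)
  where date_key between date '2022-01-01' and date '2022-01-05'
  group by cust_id  -- assumption: the original had no GROUP BY
)
select
  cust_id,
  -- assumption: a missing previous period counts as 0 instead of NULL
  cp.sales - ifnull(pp.sales, 0) as deltaSales
from users
left join currentPeriod cp using (cust_id)
left join previousPeriod pp using (cust_id)
order by cust_id
```

On the sample rows this should return -48 for customer 1 (55 minus 103), -63 for customer 2 (20 minus 83), and 35 for customer 3, who has no sales in the previous period.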

Using a window function:

with tableSales as (
  -- Random demo data: 10 rows per customer, dated between 2022-01-01 and 2022-01-11
  select
    date_sub(date("2022-01-11"), interval cast(rand()*10 as int64) day) as date_key,
    cust_id,
    cast(rand()*100 as int64) as sales
  from unnest([1,2,3]) cust_id, unnest(generate_array(1,10,1)) a
),
tmp as (
  -- Conditional sums per customer; the window repeats them on every row of that customer
  select *,
    sum(if(date_key between date('2022-01-06') and date('2022-01-11'), sales, 0)) over (partition by cust_id) as currentperiod,
    sum(if(date_key between date('2022-01-01') and date('2022-01-05'), sales, 0)) over (partition by cust_id) as previousperiod
  from tableSales
)
-- distinct collapses the repeated rows to one row per customer
select distinct cust_id, currentperiod, previousperiod from tmp

That said, a plain "group by" is a much better fit:

with tableSales as (
  -- Random demo data: 10 rows per customer, dated between 2022-01-01 and 2022-01-11
  select
    date_sub(date("2022-01-11"), interval cast(rand()*10 as int64) day) as date_key,
    cust_id,
    cast(rand()*100 as int64) as sales
  from unnest([1,2,3]) cust_id, unnest(generate_array(1,10,1)) a
),
tmp as (
  -- One row per customer with both conditional sums
  select cust_id,
    sum(if(date_key between date('2022-01-06') and date('2022-01-11'), sales, 0)) as currentperiod,
    sum(if(date_key between date('2022-01-01') and date('2022-01-05'), sales, 0)) as previousperiod
  from tableSales
  group by cust_id
)
-- group by already yields one row per customer, so no distinct is needed
select cust_id, currentperiod, previousperiod from tmp
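Neither query above computes the delta itself or restricts the result to customers active in the current period, both of which the question asks for. A minimal sketch of how the "group by" version could be extended is shown here, run against the question's sample rows instead of random data. The HAVING COUNTIF filter is one possible way to keep only current-period customers, not something taken from the original answer.

```sql
with tableSales as (
  -- Sample rows copied from the question
  select * from unnest([
    struct(date '2022-01-01' as date_key, 1 as cust_id, 30 as sales),
    (date '2022-01-02', 1, 35),
    (date '2022-01-05', 1, 38),
    (date '2022-01-10', 1, 20),
    (date '2022-01-11', 1, 35),
    (date '2022-01-01', 2, 20),
    (date '2022-01-02', 2, 25),
    (date '2022-01-04', 2, 38),
    (date '2022-01-09', 2, 20),
    (date '2022-01-15', 1, 35),
    (date '2022-01-11', 3, 35)
  ])
)
select
  cust_id,
  -- current-period total minus previous-period total
  sum(if(date_key between date '2022-01-06' and date '2022-01-11', sales, 0))
    - sum(if(date_key between date '2022-01-01' and date '2022-01-05', sales, 0)) as deltaSales
from tableSales
group by cust_id
-- keep only customers with at least one sale in the current period
having countif(date_key between date '2022-01-06' and date '2022-01-11') > 0
order by cust_id
```

On the sample rows this should give -48 for customer 1, -63 for customer 2, and 35 for customer 3. Because the table is scanned once and no self-joins are needed, this form is also shorter than the CTE version in the question.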

