date_key | cust_id | sales |
---|---|---|
2022-01-01 | 1 | 30 |
2022-01-02 | 1 | 35 |
2022-01-05 | 1 | 38 |
2022-01-10 | 1 | 20 |
2022-01-11 | 1 | 35 |
2022-01-01 | 2 | 20 |
2022-01-02 | 2 | 25 |
2022-01-04 | 2 | 38 |
2022-01-09 | 2 | 20 |
2022-01-15 | 1 | 35 |
2022-01-11 | 3 | 35 |
I would like to get every cust_id that appears in the current period (2022-01-06 to 2022-01-11) and, for each of them, the difference between sum(sales) in the current period and sum(sales) in the previous period (2022-01-01 to 2022-01-05).
How would you achieve this with a window function? Currently I am using CTEs:
with
users as(
select
distinct cust_id
from
tableSales
where date_key between date('2022-01-06) and date('2022-01-11)),
currentPeriod as(
select
distinct cust_id
,sum(sales) sales
from users
left join tableSales using (customer_id)
where date_key between date('2022-01-06) and date('2022-01-11)
),
previousPeriod as(
select
distinct cust_id
,sum(sales) sales
from users
left join tableSales using (customer_id)
where date_key between date('2022-01-05) and date('2022-01-01)
)
#-----------------------
Select
distinct cust_id
,cp.sales - pp.sales deltaSales
from users
left join currentperiod cp using(customer_id)
left join previousperiod pp using(customer_id)
There must be a shorter way to achieve this using a window function. Please help.
Your query is missing the closing quotation mark (') in each date literal.
The fields customer_id and cust_id should be the same, right?
The dates are switched: between date('2022-01-05) and date('2022-01-01). BETWEEN expects the lower bound first, so this range matches no rows.
The given time intervals are strange: it is unclear why exactly these ranges are needed.
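The remark about the switched dates can be checked quickly. The sketch below is only an illustration: it assumes SQLite (through Python's sqlite3 module) as a stand-in for BigQuery, stores the dates as ISO text, and uses a hypothetical helper named count_between. It loads the dates from the sample table and counts the matches for both orderings of the bounds.

```python
import sqlite3

# Dates copied from the sample table in the question.
DATES = [
    "2022-01-01", "2022-01-02", "2022-01-05", "2022-01-10", "2022-01-11",
    "2022-01-01", "2022-01-02", "2022-01-04", "2022-01-09", "2022-01-15",
    "2022-01-11",
]


def count_between(lower, upper):
    """Count sample rows whose date_key satisfies BETWEEN lower AND upper."""
    conn = sqlite3.connect(":memory:")
    # Assumption: ISO-formatted text dates, which compare correctly as strings.
    conn.execute("CREATE TABLE tableSales (date_key TEXT)")
    conn.executemany("INSERT INTO tableSales VALUES (?)", [(d,) for d in DATES])
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM tableSales WHERE date_key BETWEEN ? AND ?",
        (lower, upper),
    ).fetchone()
    conn.close()
    return n


print(count_between("2022-01-01", "2022-01-05"))  # bounds in order: 6
print(count_between("2022-01-05", "2022-01-01"))  # bounds switched: 0
```

With the bounds switched the filter silently returns nothing instead of raising an error, which is why the previousPeriod CTE would always come back empty.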
With a window function:
with tableSales as (
  -- random test data: 10 rows per customer, dated between 2022-01-01 and 2022-01-11
  select
    date_sub(date('2022-01-11'), interval cast(rand() * 10 as int64) day) as date_key,
    cust_id,
    cast(rand() * 100 as int64) as sales
  from unnest([1, 2, 3]) cust_id, unnest(generate_array(1, 10, 1)) a
),
tmp as (
  -- per-customer totals for each period, repeated on every row of that customer
  select *,
    sum(if(date_key between date('2022-01-06') and date('2022-01-11'), sales, 0)) over (partition by cust_id) as currentperiod,
    sum(if(date_key between date('2022-01-01') and date('2022-01-05'), sales, 0)) over (partition by cust_id) as previousperiod
  from tableSales
)
-- distinct collapses the repeated rows to one row per customer
select distinct cust_id, currentperiod, previousperiod from tmp
Well, doing a `group by` is much better, because it returns one row per customer directly instead of repeating the totals on every row:
with tableSales as (
  -- random test data: 10 rows per customer, dated between 2022-01-01 and 2022-01-11
  select
    date_sub(date('2022-01-11'), interval cast(rand() * 10 as int64) day) as date_key,
    cust_id,
    cast(rand() * 100 as int64) as sales
  from unnest([1, 2, 3]) cust_id, unnest(generate_array(1, 10, 1)) a
),
tmp as (
  -- one row per customer with the total of each period
  select cust_id,
    sum(if(date_key between date('2022-01-06') and date('2022-01-11'), sales, 0)) as currentperiod,
    sum(if(date_key between date('2022-01-01') and date('2022-01-05'), sales, 0)) as previousperiod
  from tableSales
  group by 1
)
-- distinct is redundant after the group by, but harmless
select distinct cust_id, currentperiod, previousperiod from tmp
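To get all the way to the deltaSales column from the question, the group-by version can be extended a little. The following is a rough sketch, not a tested BigQuery query: it assumes SQLite through Python's sqlite3 module as a stand-in, so the BigQuery `if(...)` is written as a standard `CASE WHEN`, and the names delta_sales, current_sales, previous_sales and current_rows are made up for illustration. It uses the sample rows from the question and keeps only customers with at least one row in the current period.

```python
import sqlite3

# Sample rows from the question: (date_key, cust_id, sales).
ROWS = [
    ("2022-01-01", 1, 30), ("2022-01-02", 1, 35), ("2022-01-05", 1, 38),
    ("2022-01-10", 1, 20), ("2022-01-11", 1, 35), ("2022-01-01", 2, 20),
    ("2022-01-02", 2, 25), ("2022-01-04", 2, 38), ("2022-01-09", 2, 20),
    ("2022-01-15", 1, 35), ("2022-01-11", 3, 35),
]

# Conditional aggregation: one pass over the table, one row per customer.
QUERY = """
SELECT cust_id, current_sales - previous_sales AS deltaSales
FROM (
    SELECT
        cust_id,
        SUM(CASE WHEN date_key BETWEEN '2022-01-06' AND '2022-01-11'
                 THEN sales ELSE 0 END) AS current_sales,
        SUM(CASE WHEN date_key BETWEEN '2022-01-01' AND '2022-01-05'
                 THEN sales ELSE 0 END) AS previous_sales,
        SUM(CASE WHEN date_key BETWEEN '2022-01-06' AND '2022-01-11'
                 THEN 1 ELSE 0 END) AS current_rows
    FROM tableSales
    GROUP BY cust_id
)
WHERE current_rows > 0   -- only customers active in the current period
ORDER BY cust_id
"""


def delta_sales():
    """Return (cust_id, deltaSales) pairs for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableSales (date_key TEXT, cust_id INTEGER, sales INTEGER)"
    )
    conn.executemany("INSERT INTO tableSales VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(delta_sales())  # [(1, -48), (2, -63), (3, 35)]
```

One difference from the CTE version in the question: a customer with no sales in the previous period (customer 3 here) gets a previous total of 0 and therefore a delta of 35, whereas the left join would give NULL. If NULL is preferred, dropping the `ELSE 0` from the previous-period sum should produce it.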