I have a dataset with weekly customer and store information.
Question: I need to calculate features such as the total number of unique customers in the last 1, 2, 3, 4, 5, 6, and so on weeks, as of the current week.
I get an error when using a distinct count of the Customers column over a window function.
I also tried the concat function to build an array, which didn't work either. Thanks for any help!
SELECT STORE,WEEK,
count(distinct Customers) over (partition by STORE order by WEEK rows between 1 preceding and 1 preceding) as last_1_week_customers,
count(distinct Customers) over (partition by STORE order by WEEK rows between 2 preceding and 2 preceding) as last_2_week_customers
from TEST_TABLE
group by STORE,WEEK
Error: SQL compilation error: distinct cannot be used with a window frame or an order.
How can I fix this error?
Input
CREATE TABLE TEST_TABLE (STORE STRING,WEEK STRING,Customers STRING);
INSERT INTO TEST_TABLE VALUES
('A','1','AA'),
('A','1','DD'),
('A','2','AA'),
('A','2','BB'),
('A','2','CC'),
('A','3','AA');
Output
Hmm... I think you don't really need window functions here at all...
First of all, we can start off with a simple grouping:
select
store,
week,
count(distinct customers) as cnt
from
test_table
where
week >= [this week's number minus 5]
group by
store, week
This will result in a simple table:
store | week | cnt |
---|---|---|
A | 1 | 2 |
A | 2 | 3 |
A | 3 | 1 |
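The bracketed placeholder in the WHERE clause above can be made concrete in several ways. Here is a minimal sketch that derives "this week" from the data itself instead of passing it in. It assumes that WEEK holds integer-like strings (as in the sample INSERT) and that the latest week in the table counts as the current week. The `current_week` CTE and its column `w` are names made up for this illustration.

```sql
-- Sketch: distinct customers per store and week for the latest six weeks.
-- Assumption: WEEK is a STRING holding integers, so it is cast explicitly
-- to avoid string comparison ('10' sorts before '2' as text).
WITH current_week AS (
    -- Hypothetical helper: treat the highest week in the table as "now"
    SELECT MAX(CAST(week AS INTEGER)) AS w
    FROM test_table
)
SELECT
    t.store,
    t.week,
    COUNT(DISTINCT t.customers) AS cnt
FROM test_table AS t
CROSS JOIN current_week AS c
WHERE CAST(t.week AS INTEGER) >= c.w - 5
GROUP BY t.store, t.week;
```

With the six sample rows, every week falls inside the range, so this should return the same three rows as the table above (row order is not guaranteed without an ORDER BY).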
At this point I'd ask you to consider if maybe this is already enough. It could be that you can already use the data in this format for whatever purpose you need. But if not, then we can further modify this to get a "pivoted" output.
In this query, replace ${w} with this week's number (so ${w-1} becomes that number minus 1, and so on):
select
store,
count(distinct case when week=${w} then customers else null end) as cnt_now,
count(distinct case when week=${w-1} then customers else null end) as cnt_minus_1,
count(distinct case when week=${w-2} then customers else null end) as cnt_minus_2,
count(distinct case when week=${w-3} then customers else null end) as cnt_minus_3,
count(distinct case when week=${w-4} then customers else null end) as cnt_minus_4,
count(distinct case when week=${w-5} then customers else null end) as cnt_minus_5
from
test_table
where
week >= ${w-5}
group by
store
Remember: COUNT() and COUNT(DISTINCT) only count non-NULL values, so rows from other weeks, for which the CASE expression returns NULL, are ignored.
store | cnt_now | cnt_minus_1 | cnt_minus_2 | cnt_minus_3 | cnt_minus_4 | cnt_minus_5 |
---|---|---|---|---|---|---|
A | 1 | 2 | 3 | 4 | 5 | 6 |
B | 9 | 8 | 7 | 6 | 5 | 4 |
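If what you actually need is one row per store and week, each with the number of unique customers seen in the weeks before it, a self-join is one way to get there without a window frame. This is only a sketch under a few assumptions: "last N weeks" means the N weeks before the row's week (the row's own week is excluded), WEEK holds integer-like strings, and the `weeks` CTE and `week_num` column are names invented for this example.

```sql
-- Sketch: trailing distinct-customer counts per store and week via a self-join.
WITH weeks AS (
    -- One row per store and week, with the week as a number
    SELECT DISTINCT store, CAST(week AS INTEGER) AS week_num
    FROM test_table
)
SELECT
    cur.store,
    cur.week_num,
    -- Distinct customers seen exactly one week earlier
    COUNT(DISTINCT CASE
        WHEN CAST(prev.week AS INTEGER) = cur.week_num - 1
        THEN prev.customers
    END) AS last_1_week_customers,
    -- Distinct customers seen in either of the two preceding weeks
    -- (the join condition already limits prev to that range)
    COUNT(DISTINCT prev.customers) AS last_2_week_customers
FROM weeks AS cur
LEFT JOIN test_table AS prev
    ON prev.store = cur.store
   AND CAST(prev.week AS INTEGER) BETWEEN cur.week_num - 2 AND cur.week_num - 1
GROUP BY cur.store, cur.week_num
ORDER BY cur.store, cur.week_num;
```

On the sample rows from the question this should give 0 and 0 for week 1, 2 and 2 for week 2, and 3 and 4 for week 3. To cover more weeks, widen the BETWEEN range in the join and add one CASE column per window. The LEFT JOIN keeps weeks that have no earlier data, and those rows simply count as 0.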