I'm trying to write a SQL query that returns every product whose most recent order price within the last 30 days differs from its most recent price in the 30 days before that, along with the calculated variance. I'm currently using PostgreSQL 11.
Right now, the data is structured into three tables: `orders`, `products`, and a pivot table, `order_product`. Here is a simplified version of the table structure:
`orders`

id | order_date |
---|---|
1 | 2022-01-15 |
2 | 2022-02-15 |
3 | 2022-03-08 |

`products`

id | name |
---|---|
1 | Some product |
2 | Another product |
3 | Yet another product |

`order_product`

order_id | product_id | unit_price |
---|---|---|
1 | 1 | 10 |
1 | 2 | 20 |
1 | 3 | 10 |
2 | 1 | 12 |
2 | 2 | 20 |
2 | 3 | 5 |
3 | 1 | 15 |
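
For anyone who wants to reproduce the problem, the sample rows above could be loaded with a script along these lines. This is only a sketch: the column types, keys, and constraints are assumptions, since the post shows just column names and values.

```sql
-- Assumed schema: types and constraints are guesses, not taken from the post.
CREATE TABLE orders (
    id         integer PRIMARY KEY,
    order_date date NOT NULL
);

CREATE TABLE products (
    id   integer PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE order_product (
    order_id   integer NOT NULL REFERENCES orders (id),
    product_id integer NOT NULL REFERENCES products (id),
    unit_price numeric NOT NULL,
    PRIMARY KEY (order_id, product_id)
);

-- Sample rows exactly as listed in the tables above.
INSERT INTO orders (id, order_date) VALUES
    (1, DATE '2022-01-15'),
    (2, DATE '2022-02-15'),
    (3, DATE '2022-03-08');

INSERT INTO products (id, name) VALUES
    (1, 'Some product'),
    (2, 'Another product'),
    (3, 'Yet another product');

INSERT INTO order_product (order_id, product_id, unit_price) VALUES
    (1, 1, 10), (1, 2, 20), (1, 3, 10),
    (2, 1, 12), (2, 2, 20), (2, 3, 5),
    (3, 1, 15);
```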
The desired output would be something like the following:
id | name | order_date | latest_unit_price | previous_unit_price | variance |
---|---|---|---|---|---|
1 | Some product | 2022-03-08 | 15 | 10 | 5 |
3 | Yet another product | 2022-02-15 | 5 | 10 | -5 |
I've been able to write a join that combines orders and products via the `order_product` table within the 60-day window, which is seemingly the easy part:
SELECT
"products"."id",
"products"."name",
"order_product"."unit_price",
"orders"."order_date"
FROM
products
JOIN order_product ON products.id = order_product.product_id
JOIN orders ON order_product.order_id = orders.id
WHERE
order_date BETWEEN now() - INTERVAL '60 days'
AND now()
I've been trying to work with `RANK()` and `LAG()`; however, where I'm getting stuck is ranking the rows within each 30-day window and then calculating the variance between the two windows.
Any help would be much appreciated!
Building off of the answer by D-Shih, I had to tweak it so that the time windows are measured back from the current date:
WITH CTE AS (
SELECT
"products"."id",
"products"."name",
"order_product"."unit_price",
"orders"."order_date"
FROM
products
JOIN order_product ON products.id = order_product.product_id
JOIN orders ON order_product.order_id = orders.id
WHERE
order_date BETWEEN now() - INTERVAL '60 days' AND now()
),
CTE2 AS (
SELECT
*,
EXTRACT(DAYS FROM now() - order_date :: timestamp) gap_days
FROM
CTE
),
CTE3 AS (
SELECT
*,
(CASE WHEN gap_days < 30 THEN 1 ELSE 0 END) grp
FROM
CTE2
)
SELECT
id,
name,
MAX(CASE WHEN grp = 1 THEN order_date END) order_date,
MAX(CASE WHEN grp = 1 THEN unit_price END) latest_unit_price,
MAX(CASE WHEN grp = 0 THEN unit_price END) previous_unit_price,
SUM(CASE WHEN grp = 1 THEN unit_price ELSE - unit_price END) variance
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ID, grp ORDER BY order_date DESC) rn
FROM
CTE3
) t1
WHERE
rn = 1
GROUP BY
id,
name
HAVING
MAX(CASE WHEN grp = 1 THEN unit_price END) <> MAX(CASE WHEN grp = 0 THEN unit_price END)
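
Because the sample rows in the question are dated 2022, a query built on `now()` returns nothing for them today. One way to check the logic is to swap `now()` for a fixed reference date. The sketch below does that and, as a variation rather than the query above, picks the latest row per window with `DISTINCT ON`. The `params` CTE, the `ref_date` value, and the aliases are hypothetical, and the date arithmetic assumes `order_date` is a `DATE` column.

```sql
WITH params AS (
    -- Hypothetical stand-in for "today", chosen so the 2022 sample rows fall in range.
    SELECT DATE '2022-03-10' AS ref_date
),
windowed AS (
    SELECT
        p.id,
        p.name,
        op.unit_price,
        o.order_date,
        -- date - date yields whole days: 1 = last 30 days, 0 = the 30 days before that
        CASE WHEN params.ref_date - o.order_date < 30 THEN 1 ELSE 0 END AS grp
    FROM products p
    JOIN order_product op ON op.product_id = p.id
    JOIN orders o ON o.id = op.order_id
    CROSS JOIN params
    WHERE o.order_date BETWEEN params.ref_date - 60 AND params.ref_date
),
latest AS (
    -- Keep only the most recent row per product and window.
    SELECT DISTINCT ON (id, grp)
        id, name, grp, order_date, unit_price
    FROM windowed
    ORDER BY id, grp, order_date DESC
)
SELECT
    cur.id,
    cur.name,
    cur.order_date,
    cur.unit_price                    AS latest_unit_price,
    prev.unit_price                   AS previous_unit_price,
    cur.unit_price - prev.unit_price  AS variance
FROM latest cur
JOIN latest prev ON prev.id = cur.id AND prev.grp = 0
WHERE cur.grp = 1
  AND cur.unit_price <> prev.unit_price
ORDER BY cur.id;
```

With the sample data from the question and this reference date, the result should match the desired output: product 1 with prices 15 and 10 (variance 5) and product 3 with prices 5 and 10 (variance -5).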
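You can use `EXTRACT` with the `LAG` window function to get the difference in days between each `order_date` and the previous `order_date` for the same product `id`.
Then use a conditional `SUM` window function over those gaps to assign each row to a group. In the query below, `grp = 1` marks rows more than 30 days after the product's first order in the 60-day window (these supply the latest price), and `grp = 0` marks the earlier rows (these supply the previous price). The query would look like this: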
WITH CTE AS (
SELECT "products"."id",
"products"."name",
"order_product"."unit_price",
"orders"."order_date"
FROM
products
JOIN order_product ON products.id = order_product.product_id
JOIN orders ON order_product.order_id = orders.id
WHERE
order_date BETWEEN now() - INTERVAL '60 days'
AND now()
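), CTE2 AS (
-- Gap in days between each order and the product's previous order (0 for its first row).
-- If order_date is a DATE rather than a timestamp, date subtraction already returns an integer and the EXTRACT wrapper should be dropped.
SELECT *,EXTRACT(DAYS FROM order_date - LAG(order_date,1,order_date) OVER(PARTITION BY id ORDER BY order_date)) gap_days
FROM CTE
), CTE3 AS (
-- Running total of the gaps: more than 30 days after the product's first row puts a row in grp 1.
SELECT *,(CASE WHEN SUM(gap_days) OVER(PARTITION BY id ORDER BY order_date) > 30 THEN 1 ELSE 0 END) grp
FROM CTE2
)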
SELECT id,
name,
MAX(CASE WHEN grp = 1 THEN order_date END) order_date,
MAX(CASE WHEN grp = 1 THEN unit_price END) latest_unit_price,
MAX(CASE WHEN grp = 0 THEN unit_price END) previous_unit_price,
SUM(CASE WHEN grp = 1 THEN unit_price ELSE - unit_price END) variance
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY ID,grp ORDER BY order_date DESC) rn
FROM CTE3
) t1
WHERE rn = 1
GROUP BY id,
name
HAVING MAX(CASE WHEN grp = 1 THEN unit_price END) <> MAX(CASE WHEN grp = 0 THEN unit_price END)
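
To see how the `LAG` gap and the running `SUM` in the query above assign groups, here is a small self-contained sketch that uses only product 1's order dates from the question. The `sample` and `gaps` CTE names and the `days_since_first` alias are made up for illustration, and the dates are typed as `DATE`, so plain subtraction already yields whole days.

```sql
WITH sample (product_id, order_date) AS (
    VALUES
        (1, DATE '2022-01-15'),
        (1, DATE '2022-02-15'),
        (1, DATE '2022-03-08')
),
gaps AS (
    SELECT
        product_id,
        order_date,
        -- Days since the previous order of the same product; 0 for the first row
        order_date - LAG(order_date, 1, order_date)
            OVER (PARTITION BY product_id ORDER BY order_date) AS gap_days
    FROM sample
)
SELECT
    product_id,
    order_date,
    gap_days,
    -- The running total of gaps equals the days since the product's first order
    SUM(gap_days) OVER (PARTITION BY product_id ORDER BY order_date) AS days_since_first,
    CASE
        WHEN SUM(gap_days) OVER (PARTITION BY product_id ORDER BY order_date) > 30
        THEN 1 ELSE 0
    END AS grp
FROM gaps
ORDER BY order_date;
```

For these three rows the gaps are 0, 31, and 21 days, so the running totals are 0, 31, and 52 and the groups are 0, 1, and 1. The grouping is therefore relative to each product's first order inside the 60-day window, not to the current date.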