I have this table schema (unimportant columns removed; it matches the sample data SQL further down), and I am building a REST API on top of it. I need to get paginated products sold within some date range.
That part would not be a problem, but I also need to sort them by either the product's code or its sold quantity, and the latter is where I am stuck.
My first idea was to query products and then use a subquery to find their sold_product rows, filtered by date, and sum the resulting sold quantities, then order by that sum.
This works, but it is really not efficient: for every single product I have to sum its sold products, and in the end I only take 10 results (because I'm using pagination). My latest attempt this way took 6.5 seconds, which is too slow.
My second idea was to go from the other side: query sold_product, group the rows by product_id, filter them, and then find their parent products.
This works well and would be exactly what I need, except for one problem: I don't get products which have not been sold yet.
How can I do this? I don't think I need an exact query; an idea of how to approach this should be enough.
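One possible direction, sketched here under assumptions rather than as a definitive answer: aggregate sold_product once per product inside a derived table, LEFT JOIN it onto product so unsold products survive, and turn the missing sums into 0 with COALESCE. The sketch below uses Python's built-in sqlite3 module as a stand-in for the real database, loads the sample data from this post, and uses a hypothetical helper name (`sold_page`) and a half-open date range; none of these choices come from the original setup.

```python
import sqlite3

# Assumption: SQLite stands in for the real database engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product(id INT, code VARCHAR(25));
CREATE TABLE product_variant(id INT, product_id INT, code VARCHAR(25), initial_quantity INT);
CREATE TABLE sold_product(id INT, product_id INT, product_variant_id INT, quantity INT, date_time DATETIME);
INSERT INTO product VALUES (1,'A11'),(2,'A12'),(3,'B11');
INSERT INTO product_variant VALUES
  (1,1,'A11-1',50),(2,1,'A11-2',50),(3,1,'A11-3',80),
  (4,2,'A12-1',20),(5,2,'A12-2',30),(6,2,'A12-3',80),(7,2,'A12-4',90),
  (8,3,'B11-1',70),(9,3,'B11-2',70);
INSERT INTO sold_product VALUES
  (1,1,1,20,'2021-04-01'),(2,1,1,15,'2021-04-01'),(3,1,2,15,'2021-04-04'),
  (4,1,3,10,'2021-04-05'),(5,1,3,19,'2021-04-07'),(6,2,4,11,'2021-04-07'),
  (7,2,5,12,'2021-04-08'),(8,2,7,15,'2021-04-10'),(9,2,7,15,'2021-04-10');
""")

# Each child table is summed once per product in its own derived table,
# so joining both onto product does not multiply rows.
PAGE_QUERY = """
SELECT p.id, p.code,
       COALESCE(v.total_initial, 0) AS initial_quantity,
       COALESCE(s.total_sold, 0)    AS sold_quantity
FROM product AS p
LEFT JOIN (
    SELECT product_id, SUM(initial_quantity) AS total_initial
    FROM product_variant
    GROUP BY product_id
) AS v ON v.product_id = p.id
LEFT JOIN (
    SELECT product_id, SUM(quantity) AS total_sold
    FROM sold_product
    WHERE date_time >= ? AND date_time < ?
    GROUP BY product_id
) AS s ON s.product_id = p.id
ORDER BY sold_quantity ASC, p.id ASC  -- p.id breaks ties so pages stay stable
LIMIT ? OFFSET ?
"""

def sold_page(start, end_exclusive, limit=10, offset=0):
    """Return one page of products with their sold quantity in [start, end_exclusive)."""
    return conn.execute(PAGE_QUERY, (start, end_exclusive, limit, offset)).fetchall()

# Date range 2021-04-07 to 2021-04-08 inclusive, written as a half-open interval.
rows = sold_page("2021-04-07", "2021-04-09")
print(rows)  # [(3, 'B11', 140, 0), (1, 'A11', 180, 19), (2, 'A12', 220, 23)]
```

The date filter sits inside the derived table on purpose: putting it in an outer WHERE clause would discard the unsold products again.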
Thank you in advance!
EDIT:
Input data as tables (hope it's OK like this):
product:
id | code |
---|---|
1 | A11 |
2 | A12 |
3 | B11 |
product_variant:
id | product_id | code | initial_quantity |
---|---|---|---|
1 | 1 | A11-1 | 50 |
2 | 1 | A11-2 | 50 |
3 | 1 | A11-3 | 80 |
4 | 2 | A12-1 | 20 |
5 | 2 | A12-2 | 30 |
6 | 2 | A12-3 | 80 |
7 | 2 | A12-4 | 90 |
8 | 3 | B11-1 | 70 |
9 | 3 | B11-2 | 70 |
sold_product:
id | product_id | product_variant_id | quantity | date_time |
---|---|---|---|---|
1 | 1 | 1 | 20 | 2021-04-01 |
2 | 1 | 1 | 15 | 2021-04-01 |
3 | 1 | 2 | 15 | 2021-04-04 |
4 | 1 | 3 | 10 | 2021-04-05 |
5 | 1 | 3 | 19 | 2021-04-07 |
6 | 2 | 4 | 11 | 2021-04-07 |
7 | 2 | 5 | 12 | 2021-04-08 |
8 | 2 | 7 | 15 | 2021-04-10 |
9 | 2 | 7 | 15 | 2021-04-10 |
Expected result (no date filter):
product_id | product_code | initial_quantity | sold_quantity |
---|---|---|---|
1 | A11 | 180 | 79 |
2 | A12 | 220 | 53 |
3 | B11 | 140 | 0 or NULL |
Expected result when the sold date range is 2021-04-07 to 2021-04-08, ordered by sold_quantity ascending:
product_id | product_code | initial_quantity | sold_quantity |
---|---|---|---|
3 | B11 | 140 | 0 or NULL |
1 | A11 | 180 | 19 |
2 | A12 | 220 | 23 |
Sample data SQL:
CREATE TABLE product(id INT, code VARCHAR(25));
CREATE TABLE product_variant(id INT, product_id INT, code VARCHAR(25), initial_quantity INT);
CREATE TABLE sold_product(id INT, product_id INT, product_variant_id INT, quantity INT, date_time DATETIME);
INSERT INTO product VALUES(1,'A11');
INSERT INTO product VALUES(2,'A12');
INSERT INTO product VALUES(3,'B11');
INSERT INTO product_variant VALUES(1,1,'A11-1',50);
INSERT INTO product_variant VALUES(2,1,'A11-2',50);
INSERT INTO product_variant VALUES(3,1,'A11-3',80);
INSERT INTO product_variant VALUES(4,2,'A12-1',20);
INSERT INTO product_variant VALUES(5,2,'A12-2',30);
INSERT INTO product_variant VALUES(6,2,'A12-3',80);
INSERT INTO product_variant VALUES(7,2,'A12-4',90);
INSERT INTO product_variant VALUES(8,3,'B11-1',70);
INSERT INTO product_variant VALUES(9,3,'B11-2',70);
INSERT INTO sold_product VALUES(1,1,1,20,'2021-04-01');
INSERT INTO sold_product VALUES(2,1,1,15,'2021-04-01');
INSERT INTO sold_product VALUES(3,1,2,15,'2021-04-04');
INSERT INTO sold_product VALUES(4,1,3,10,'2021-04-05');
INSERT INTO sold_product VALUES(5,1,3,19,'2021-04-07');
INSERT INTO sold_product VALUES(6,2,4,11,'2021-04-07');
INSERT INTO sold_product VALUES(7,2,5,12,'2021-04-08');
INSERT INTO sold_product VALUES(8,2,7,15,'2021-04-10');
INSERT INTO sold_product VALUES(9,2,7,15,'2021-04-10');
First idea:
SELECT p.id, p.code,
(
SELECT SUM(v.initial_quantity)
FROM product_variant AS v
WHERE p.id = v.product_id
) AS initial_quantity,
(
SELECT SUM(s.quantity)
FROM sold_product AS s
WHERE p.id = s.product_id
) AS sold_quantity
FROM product AS p
ORDER BY sold_quantity;
Second idea:
SELECT p.id, p.code,
(
SELECT SUM(v.initial_quantity)
FROM product_variant AS v
WHERE p.id = v.product_id
) AS initial_quantity,
SUM(s.quantity) AS sold_quantity
FROM sold_product AS s
INNER JOIN product AS p ON s.product_id = p.id
GROUP BY p.code, p.id,
(
SELECT SUM(v.initial_quantity)
FROM product_variant AS v
WHERE p.id = v.product_id
)
ORDER BY sold_quantity;
EDIT: JOIN attempt, which returns a single row with the sum of all sold quantities:
SELECT p.id, p.code,
(
SELECT SUM(v.initial_quantity)
FROM product_variant AS v
WHERE p.id = v.product_id
) AS initial_quantity,
SUM(s.quantity) AS sold_quantity
FROM product AS p
JOIN sold_product AS s ON p.id = s.product_id
ORDER BY sold_quantity;
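For what it's worth, the JOIN attempt above has an aggregate but no GROUP BY, which is why everything collapses into one row. A minimal sketch of a variant that groups per product and keeps unsold products is shown below, again using Python's sqlite3 as an assumed stand-in for the real database. The date range goes into the ON clause rather than WHERE, and initial_quantity is left out here because also joining product_variant in the same query would multiply the sold rows.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; only the two tables
# needed for this query are created, with the sample rows from this post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product(id INT, code VARCHAR(25));
CREATE TABLE sold_product(id INT, product_id INT, product_variant_id INT, quantity INT, date_time DATETIME);
INSERT INTO product VALUES (1,'A11'),(2,'A12'),(3,'B11');
INSERT INTO sold_product VALUES
  (1,1,1,20,'2021-04-01'),(2,1,1,15,'2021-04-01'),(3,1,2,15,'2021-04-04'),
  (4,1,3,10,'2021-04-05'),(5,1,3,19,'2021-04-07'),(6,2,4,11,'2021-04-07'),
  (7,2,5,12,'2021-04-08'),(8,2,7,15,'2021-04-10'),(9,2,7,15,'2021-04-10');
""")

# LEFT JOIN keeps products without matching sales; the date condition lives
# in ON so that it restricts the sales rows, not the products.
JOIN_QUERY = """
SELECT p.id, p.code, COALESCE(SUM(s.quantity), 0) AS sold_quantity
FROM product AS p
LEFT JOIN sold_product AS s
       ON s.product_id = p.id
      AND s.date_time >= ? AND s.date_time < ?
GROUP BY p.id, p.code
ORDER BY sold_quantity DESC, p.id
"""

all_sales = conn.execute(JOIN_QUERY, ("2021-04-01", "2021-04-11")).fetchall()
two_days = conn.execute(JOIN_QUERY, ("2021-04-07", "2021-04-09")).fetchall()
print(all_sales)  # [(1, 'A11', 79), (2, 'A12', 53), (3, 'B11', 0)]
print(two_days)   # [(2, 'A12', 23), (1, 'A11', 19), (3, 'B11', 0)]
```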
Solved: the problem was a missing index on sold_product.product_id. I never thought a missing index could slow queries down that much, up to 30 seconds. With the index in place, the query takes 0.1 s.
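For anyone landing here with the same symptom, a small sketch of the fix is below. It assumes SQLite (via Python's sqlite3) instead of the real database, and the index name `idx_sold_product_product_id` is made up for illustration. It creates the index and then asks the planner whether the per-product lookup uses it.

```python
import sqlite3

# Assumption: SQLite stands in for the real database engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sold_product(id INT, product_id INT, product_variant_id INT, quantity INT, date_time DATETIME);
INSERT INTO sold_product VALUES
  (1,1,1,20,'2021-04-01'),(2,1,1,15,'2021-04-01'),(3,1,2,15,'2021-04-04'),
  (4,1,3,10,'2021-04-05'),(5,1,3,19,'2021-04-07'),(6,2,4,11,'2021-04-07'),
  (7,2,5,12,'2021-04-08'),(8,2,7,15,'2021-04-10'),(9,2,7,15,'2021-04-10');
-- Hypothetical index name; the indexed column is the one named in the post.
CREATE INDEX idx_sold_product_product_id ON sold_product(product_id);
""")

LOOKUP = "SELECT SUM(quantity) FROM sold_product WHERE product_id = ?"

# The last column of each EXPLAIN QUERY PLAN row is a human-readable detail string.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + LOOKUP, (1,))]
total = conn.execute(LOOKUP, (1,)).fetchone()[0]

print(plan)   # the detail text should mention idx_sold_product_product_id
print(total)  # 79
```

Other engines have their own syntax for inspecting plans, so the EXPLAIN part is specific to this SQLite sketch; the CREATE INDEX statement itself is standard.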
Thank you all who tried to help!