I want to sum up orders. There are products (table p) and ordered items (table i), like this:
DROP TABLE IF EXISTS p;
CREATE TABLE p (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`combine` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `combine`(`combine`)
) ENGINE=InnoDB;
DROP TABLE IF EXISTS i;
CREATE TABLE i (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`p` int(10) unsigned DEFAULT NULL,
`quantity` decimal(15,2) NOT NULL,
PRIMARY KEY (`id`),
INDEX `p`(`p`)
) ENGINE=InnoDB;
INSERT INTO p SET id=1, combine=NULL;
INSERT INTO p SET id=2, combine=1;
INSERT INTO p SET id=3, combine=1;
INSERT INTO p SET id=4, combine=NULL;
INSERT INTO i SET id=1, p=1, quantity=5;
INSERT INTO i SET id=2, p=1, quantity=2;
INSERT INTO i SET id=3, p=2, quantity=1;
INSERT INTO i SET id=4, p=3, quantity=4;
INSERT INTO i SET id=5, p=4, quantity=2;
INSERT INTO i SET id=6, p=4, quantity=1;
The idea is that products may be combined, which means that all sales of the combined products are added together. For example, products 1, 2 and 3 should all have the same result: the sum of all sales of these three products. So I do:
SELECT p.id, SUM(i.quantity)
FROM p
LEFT JOIN p AS p_all ON (p_all.id = p.id OR p_all.combine=p.combine OR p_all.id = p.combine OR p_all.combine = p.id)
LEFT JOIN i ON i.p = p_all.id
GROUP BY p.id;
which gives the required result:
p=1: 12 (i: 1, 2, 3, 4 added)
p=2: 12 (i: 1, 2, 3, 4 added)
p=3: 12 (i: 1, 2, 3, 4 added)
p=4: 3 (i: 5, 6 added)
My problem is that on the real data the OR conditions in the self-join of p (alias p_all) make the query very slow. Querying without the combination takes 0.2 s, while the version with the OR conditions takes more than 30 s.
How could I make this query more efficient in MySQL?
Added: There are some more constraints on the real query like:
SELECT p.id, SUM(i.quantity)
FROM p
LEFT JOIN p AS p_all ON (p_all.id = p.id OR p_all.combine=p.combine OR p_all.id = p.combine OR p_all.combine = p.id)
LEFT JOIN i ON i.p = p_all.id
LEFT JOIN orders o ON o.id = i.order
WHERE o.ordered <= '2018-05-10'
AND i.flag=false
AND ...
GROUP BY p.id;
Added: EXPLAIN on real data:
+----+-------------+-------+------------+-------+-----------------------+---------+---------+----------+------+----------+-------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys         | key     | key_len | ref      | rows | filtered | Extra                                           |
+----+-------------+-------+------------+-------+-----------------------+---------+---------+----------+------+----------+-------------------------------------------------+
|  1 | SIMPLE      | p     | NULL       | index | PRIMARY,...combine... | PRIMARY | 4       | NULL     | 6556 |   100.00 | NULL                                            |
|  1 | SIMPLE      | p_all | NULL       | ALL   | PRIMARY,combine       | NULL    | NULL    | NULL     | 6556 |   100.00 | Range checked for each record (index map: 0x41) |
|  1 | SIMPLE      | i     | NULL       | ref   | p                     | p       | 5       | p_all.id |   43 |   100.00 | NULL                                            |
+----+-------------+-------+------------+-------+-----------------------+---------+---------+----------+------+----------+-------------------------------------------------+
I don't know if you have the flexibility to do this, but you could speed it up by changing the combine column in p so that every uncombined product points to itself:
UPDATE p SET combine=id WHERE combine IS NULL;
Then you can massively simplify the ON condition to:
ON p_all.combine = p.combine
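which makes the query (SQLFiddle) the following. Note that it uses inner joins, so products without any ordered items are left out; keep the LEFT JOINs if you need them listed: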
SELECT p.id, SUM(i.quantity) AS qty
FROM p
JOIN p AS p_all
ON p_all.combine = p.combine
JOIN i
ON i.p = p_all.id
GROUP BY p.id
Output:
id qty
1 12
2 12
3 12
4 3
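If changing the data is not an option, a similar effect may be possible by computing the group key on the fly. This is only a sketch: it assumes that combine always points at a product whose own combine is NULL (one level of combination, as in the sample data), and the names grp, g and qty are made up for illustration. Each group is summed once in a derived table, which is then joined back to p:

```sql
-- Sketch only: grp, g and qty are illustrative names, not part of the original schema.
-- Assumes combine always references a product whose own combine is NULL.
SELECT p.id, g.qty
FROM p
LEFT JOIN (
    -- One row per group: the key is combine, or id for uncombined products.
    SELECT COALESCE(p_all.combine, p_all.id) AS grp,
           SUM(i.quantity) AS qty
    FROM p AS p_all
    JOIN i ON i.p = p_all.id
    GROUP BY grp
) AS g ON g.grp = COALESCE(p.combine, p.id)
ORDER BY p.id;
```

On the sample data this should return the same four rows as above. Any extra filters on i (or on joined tables such as orders) would go inside the derived table. Whether it is actually faster on the real data is something to check with EXPLAIN.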
Using subqueries can sometimes be faster than joins, e.g.:
SELECT p.id,
       (SELECT SUM(i.quantity)
        FROM i
        WHERE i.p IN (SELECT p2.id
                      FROM p AS p2
                      WHERE p2.id = p.id
                         OR p2.combine = p.id
                         OR p2.id = p.combine
                         OR p2.combine = p.combine)
       ) AS orders
FROM p;
You could add all of your constraints on i inside the 'orders' subquery.
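For example, with the date and flag constraints from the question moved into the subquery, it might look roughly like this. It is only a sketch: the orders table and the columns i.order, o.ordered and i.flag are taken from the question's second query and are not part of the sample schema, and the conditions elided there ("AND ...") are left out here as well:

```sql
-- Sketch only: orders, i.order, o.ordered and i.flag come from the question's
-- real query and do not exist in the sample tables p and i.
SELECT p.id,
       (SELECT SUM(i.quantity)
        FROM i
        JOIN orders AS o ON o.id = i.`order`
        WHERE i.flag = FALSE
          AND o.ordered <= '2018-05-10'
          AND i.p IN (SELECT p2.id
                      FROM p AS p2
                      WHERE p2.id = p.id
                         OR p2.combine = p.id
                         OR p2.id = p.combine
                         OR p2.combine = p.combine)
       ) AS orders
FROM p;
```

One difference to be aware of: products with no matching items come back with NULL here, whereas the WHERE clause in the question's version filters those products out.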