Can this MySQL Query be optimized?

I'm trying to optimize a MySQL query that runs a little slowly on tables with 10,000+ rows.

CREATE TABLE IF NOT EXISTS `person` (
  `_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `_oid` char(8) NOT NULL,
  `firstname` varchar(255) NOT NULL,
  `lastname` varchar(255) NOT NULL,
  PRIMARY KEY (`_id`),
  KEY `_oid` (`_oid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `person_cars` (
  `_id` int(11) NOT NULL AUTO_INCREMENT,
  `_oid` char(8) NOT NULL,
  `idx` varchar(255) NOT NULL,
  `val` blob NOT NULL,
  PRIMARY KEY (`_id`),
  KEY `_oid` (`_oid`),
  KEY `idx` (`idx`),
  KEY `val` (`val`(64))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

# Insert some 10000+ rows…

INSERT INTO `person` (`_oid`,`firstname`,`lastname`)
VALUES
    ('1', 'John', 'Doe'),
    ('2', 'Jack', 'Black'),
    ('3', 'Jim', 'Kirk'),
    ('4', 'Forrest', 'Gump');

INSERT INTO `person_cars` (`_oid`,`idx`,`val`)
VALUES
    ('1', '0', 'BMW'),
    ('1', '1', 'PORSCHE'),
    ('2', '0', 'BMW'),
    ('3', '1', 'MERCEDES'),
    ('3', '0', 'TOYOTA'),
    ('3', '1', 'NISSAN'),
    ('4', '0', 'OLDMOBILE');


SELECT `_person`.`_oid`,
       `_person`.`firstname`,
       `_person`.`lastname`,
       `_person_cars`.`cars[0]`,
       `_person_cars`.`cars[1]`

FROM `person` `_person` 

LEFT JOIN (

   SELECT `_person`.`_oid`,
          IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`,
          IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]`
   FROM `person` `_person` 
   JOIN `person_cars` `_person_cars` ON `_person`.`_oid` = `_person_cars`.`_oid`
   GROUP BY `_person`.`_oid`

) `_person_cars` ON `_person_cars`.`_oid` = `_person`.`_oid` 

WHERE `cars[0]` = 'BMW' OR `cars[1]` = 'BMW';

The above SELECT query takes ~170 ms on my virtual machine running MySQL 5.1.53, with approx. 10,000 rows in each of the two tables.

When I EXPLAIN the above query, the results differ depending on how many rows are in each table. With only the sample rows above:

+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| id | select_type | table        | type  | possible_keys | key  | key_len | ref  | rows | Extra                                       |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL   | NULL          | NULL | NULL    | NULL |    4 | Using where                                 |
|  1 | PRIMARY     | _person      | ALL   | _oid          | NULL | NULL    | NULL |    4 | Using where; Using join buffer              |
|  2 | DERIVED     | _person_cars | ALL   | _oid          | NULL | NULL    | NULL |    7 | Using temporary; Using filesort             |
|  2 | DERIVED     | _person      | index | _oid          | _oid | 24      | NULL |    4 | Using where; Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+

With some 10,000 rows, I get the following result:

+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref                    | rows | Extra                           |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL  | NULL          | NULL | NULL    | NULL                   | 6613 | Using where                     |
|  1 | PRIMARY     | _person      | ref  | _oid          | _oid | 24      | _person_cars._oid      |   10 |                                 |
|  2 | DERIVED     | _person_cars | ALL  | _oid          | NULL | NULL    | NULL                   | 9913 | Using temporary; Using filesort |
|  2 | DERIVED     | _person      | ref  | _oid          | _oid | 24      | test._person_cars._oid |   10 | Using index                     |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+

Things get worse when I leave out the WHERE clause or when I LEFT JOIN another table similar to person_cars.

Does anyone have an idea how to optimize the SELECT query to make things a little faster?

It's slow because this forces three full table scans on person, which then get joined together:

LEFT JOIN (
  ...
  GROUP BY `_person`.`_oid` -- the group by here
) `_person_cars` ...

WHERE ... -- and the where clauses on _person_cars.

For one, given the WHERE clauses, the LEFT JOIN is really an inner join: rows with no matching car can never satisfy the condition. You could also move the conditions into the subquery, so they are applied before the join with person actually occurs. And the join with person is needlessly applied twice, once inside the subquery and once outside.

The following will make it faster, but if you have an ORDER BY/LIMIT clause it will still lead to a full table scan on person (i.e. still not good) because of the GROUP BY in the subquery:

JOIN (
   SELECT `_person_cars`.`_oid`,
          IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`,
          IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]`
   FROM `person_cars` `_person_cars` -- alias needed for the column references above
   GROUP BY `_person_cars`.`_oid`
   HAVING IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) = 'BMW' OR
          IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) = 'BMW'
) `_person_cars` ... -- smaller number of rows

If you apply an ORDER BY/LIMIT, you'll get better results with two queries. First select the matching people:

SELECT `_person`.`_oid`,
       `_person`.`firstname`,
       `_person`.`lastname`
FROM `person` `_person`
JOIN `person_cars` `_person_cars`
ON `_person_cars`.`_oid` = `_person`.`_oid`
AND `_person_cars`.`val` = 'BMW'
GROUP BY -- pre-sort the result before grouping, so as to not do the work twice
         `_person`.`lastname`,
         `_person`.`firstname`,
         -- collapse people with multiple BMWs into one row
         `_person`.`_oid`
ORDER BY `_person`.`lastname`,
         `_person`.`firstname`,
         `_person`.`_oid`
LIMIT 10;

Then select the cars in a second query, with an IN () clause built from the resulting ids.
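A minimal sketch of that second query, assuming the first query returned the people with `_oid` values '1' and '2' (the two BMW owners in the sample data). In practice the application would build the IN () list from whatever ids came back:

```sql
-- Second query: fetch every car of the people found by the first query.
-- The ids '1' and '2' are placeholders for the _oid values it returned.
SELECT `_oid`,
       `idx`,
       `val`
FROM `person_cars`
WHERE `_oid` IN ('1', '2')
ORDER BY `_oid`, `idx`;
```

Since the IN () list holds at most as many ids as the LIMIT of the first query, this lookup can use the existing `_oid` index on person_cars instead of grouping the whole table.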

Oh, and your val column should probably be a varchar.
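The column change could look roughly like this. The length of 255 is an assumption, chosen only to match the other varchar columns in the schema; pick whatever fits the real data:

```sql
-- Convert val from blob to varchar; 255 is an assumed length.
-- Back up the table first: the stored bytes get reinterpreted as utf8 text.
ALTER TABLE `person_cars`
  MODIFY `val` varchar(255) NOT NULL;
```

Note that this also changes comparison behaviour: a blob is compared byte by byte, whereas a varchar uses the column's collation, which is case-insensitive with the default utf8 collation.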

Try this:

SELECT
  p._oid      AS oid,
  p.firstname AS firstname,
  p.lastname  AS lastname,
  pc.val      AS car1,
  pc2.val     AS car2
FROM person AS p
  LEFT JOIN person_cars AS pc
    ON pc._oid = p._oid
      AND pc.idx = 0
  LEFT JOIN person_cars AS pc2
    ON pc2._oid = p._oid
      AND pc2.idx = 1
WHERE pc.val = 'BMW'
     OR pc2.val = 'BMW';
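Whether this version is actually faster depends on the indexes available for the two joins. One thing that might help, as a sketch only: a composite index covering both join columns. The index name `oid_idx` is made up for illustration. Because idx is a varchar, comparing it with a number keeps MySQL from using an index on it, so the literals are quoted here:

```sql
-- Hypothetical composite index so each join can look up (_oid, idx) directly.
ALTER TABLE `person_cars`
  ADD INDEX `oid_idx` (`_oid`, `idx`);

-- Same query with string literals for idx; EXPLAIN shows which keys get used.
EXPLAIN
SELECT p._oid, p.firstname, p.lastname,
       pc.val  AS car1,
       pc2.val AS car2
FROM person AS p
  LEFT JOIN person_cars AS pc
    ON pc._oid = p._oid
      AND pc.idx = '0'
  LEFT JOIN person_cars AS pc2
    ON pc2._oid = p._oid
      AND pc2.idx = '1'
WHERE pc.val = 'BMW'
   OR pc2.val = 'BMW';
```

Compare the EXPLAIN output before and after adding the index, on the real 10,000-row data, before keeping it.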
