简体   繁体   中英

How to make use of index on LEFT OUTER JOIN

Here is a set of tables describing music composers :

CREATE TABLE IF NOT EXISTS `compositors` (
`id` int(11) NOT NULL,
  `name` varchar(45) NOT NULL COMMENT 'Nom et Prenom',
  `birth_date` varchar(45) DEFAULT NULL,
  `death_date` varchar(45) DEFAULT NULL,
  `birth_place` varchar(45) DEFAULT NULL,
  `death_place` varchar(45) DEFAULT NULL,
  `gender` enum('M','F') DEFAULT NULL,
  `century` varchar(45) DEFAULT NULL,
  `country` int(11) DEFAULT NULL
) ENGINE=InnoDB AUTO_INCREMENT=28741 DEFAULT CHARSET=latin1;


CREATE TABLE IF NOT EXISTS `compositor_biography` (
`index` int(11) NOT NULL,
  `compositor_id` int(11) NOT NULL,
  `url` varchar(255) DEFAULT NULL
) ENGINE=InnoDB AUTO_INCREMENT=15325 DEFAULT CHARSET=latin1;


CREATE TABLE IF NOT EXISTS `compositor_comments` (
  `compositor_id` int(11) NOT NULL,
  `comment` text NOT NULL,
  `public` enum('Publique','Privé') NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;


CREATE TABLE IF NOT EXISTS `compositor_country` (
  `compositor_id` int(11) NOT NULL,
  `country_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Here are my indexes :

--
-- Index pour la table `compositors`
--
ALTER TABLE `compositors` ADD PRIMARY KEY (`id`), ADD KEY `countries` (`country`);
ALTER TABLE `compositor_biography` ADD PRIMARY KEY (`index`), ADD KEY `index` (`compositor_id`);
ALTER TABLE `compositor_comments` ADD KEY `c_compositor_idx` (`compositor_id`);

And finally sample data :

INSERT INTO `compositors` (`id`, `name`, `birth_date`, `death_date`, `birth_place`, `death_place`, `gender`, `century`, `country`) VALUES
(1, 'Dummy Compositor', '1606', '1676', 'Bruxellesss', NULL, 'F', '17', 11);

INSERT INTO `compositor_biography` (`index`, `compositor_id`, `url`) VALUES
(15322, 1, 'Dummy Link 1'),
(15323, 1, 'Dummy Link 2'),
(15324, 1, 'Dummy Link 3');

INSERT INTO `compositor_comments` (`compositor_id`, `comment`, `public`) VALUES
(1, 'Dummy Comment', 'Privé');

Here is an example query that my PHP script generate :

SELECT DISTINCT compositors.id, compositors.name, compositors.birth_date, compositors.death_date, compositors.birth_place, compositors.death_place, compositors.gender, compositors.century, compositors.country,  
GROUP_CONCAT( compositor_biography.url SEPARATOR ';') AS concat_compositor_biography_url, 
GROUP_CONCAT( compositor_comments.comment SEPARATOR ';') AS concat_compositor_comments_comment, 
GROUP_CONCAT( compositor_comments.public + 0 SEPARATOR ';') AS concat_compositor_comments_public 
FROM compositors 
LEFT JOIN compositor_biography ON compositors.id = compositor_biography.compositor_id 
LEFT JOIN compositor_comments ON compositors.id = compositor_comments.compositor_id 
GROUP BY compositors.id

However, this one has a problem, if you execute this query, you can see that in the concat_compositor_comments_comment column, you have that result :

Dummy Comment;Dummy Comment;Dummy Comment

but there is only one actual comment.

I didn't really understand what was the problem there, but it seemed like it was the GROUP BY. It should have one GROUP BY per JOIN - according to the second answer at Multiple GROUP_CONCAT on different fields using MySQL -- so I did it, and it worked, with this query :

SELECT DISTINCT compositors.id,
    compositors.NAME,
    compositors.birth_date,
    compositors.death_date,
    compositors.birth_place,
    compositors.death_place,
    compositors.gender,
    compositors.century,
    compositors.country,
    concat_compositor_biography_url,
    concat_compositor_comments_comment,
    concat_compositor_comments_public
FROM compositors
LEFT JOIN (
    SELECT compositor_id,
        GROUP_CONCAT(compositor_biography.url SEPARATOR ';') AS concat_compositor_biography_url
    FROM compositor_biography
    GROUP BY compositor_biography.compositor_id
    ) compositor_biography ON compositors.id = compositor_biography.compositor_id
LEFT JOIN (
    SELECT compositor_id,
        GROUP_CONCAT(compositor_comments.comment SEPARATOR ';') AS concat_compositor_comments_comment,
        GROUP_CONCAT(compositor_comments.PUBLIC + 0 SEPARATOR ';') AS concat_compositor_comments_public
    FROM compositor_comments
    GROUP BY compositor_comments.compositor_id
    ) compositor_comments ON compositors.id = compositor_comments.compositor_id

However, this query has huge performance problem, since it doesn't use INDEXES, or at least it seems to scan all the tables, and with 24000 composers, it takes approx 420 second for that query, while the other ( which gives wrong results on GROUP BY ) takes 1 second.

How can I change the second query, so it uses correctly the index and doesn't scan all the tables ?

Here is a link to a SQL-Fiddle the database schema : http://sqlfiddle.com/#!2/6b0132


UPDATE

According to @phil_w, and after further testing, this query seems to work with very good performance :

SELECT a.id,
    a.name,
    a.concat_compositor_biography_url,
    b.concat_compositor_aliases_data,
    GROUP_CONCAT(compositor_comments.comment SEPARATOR ';') as concat_compositor_comments_comment,
    GROUP_CONCAT(compositor_comments.public + 0 SEPARATOR ';') as concat_compositor_comments_public
FROM (
    SELECT b.id,
    b.name,
    b.concat_compositor_biography_url,
    GROUP_CONCAT(compositor_aliases.data SEPARATOR ';') as concat_compositor_aliases_data
    FROM (
        SELECT compositors.id,
            compositors.name,
            GROUP_CONCAT(compositor_biography.url SEPARATOR ';') AS concat_compositor_biography_url
        FROM compositors
        LEFT JOIN compositor_biography ON compositors.id = compositor_biography.compositor_id
        GROUP BY compositors.id
    ) b
    LEFT JOIN compositor_aliases ON b.id = compositor_aliases.compositor_id
    GROUP BY b.id
) a
LEFT JOIN compositor_comments ON a.id = compositor_comments.compositor_id
GROUP BY a.id  

However, how would it be possible to have the same result in a more compact query ? ( by the way, shall I create a new question for that and make this one resolved ? )

This question has nothing to do with "indexes". The problem is that you have two joins and every combination of rows will be returned (ie you have 3 matching rows in the other join to compositor_biography ).

The fix is simple - just add DISTINCT to the GROUP_CONCAT() function:

...
GROUP_CONCAT( DISTINCT compositor_comments.comment SEPARATOR ';') AS concat_compositor_comments_comment, 
...

It's normal that you have the entry 3 times, because you have 3 rows in compositor_biography...

perhaps you could go step by step, first gathering the bio only:

  SELECT compositors.id, compositors.name, 
    GROUP_CONCAT( compositor_biography.url SEPARATOR ';') AS   concat_compositor_biography_url
    FROM compositors 
    LEFT JOIN compositor_biography ON compositors.id =   compositor_biography.compositor_id  
    GROUP BY compositors.id

then join the rest

select t.id, t.name,t.concat_compositor_biography_url,
GROUP_CONCAT( compositor_comments.comment SEPARATOR ';') AS concat_compositor_comments_comment
from (
  SELECT compositors.id, compositors.name, 
  GROUP_CONCAT( compositor_biography.url SEPARATOR ';') AS concat_compositor_biography_url
  FROM compositors 
  LEFT JOIN compositor_biography ON compositors.id = compositor_biography.compositor_id 
  GROUP BY compositors.id
) t
LEFT JOIN compositor_comments ON t.id = compositor_comments.compositor_id  

and so on...

I don't see why it would not use the index unless the table is small. Try 'explain select...' to confirm that.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM