简体   繁体   中英

how to rewrite the query to make it faster and sql compliant

This query:

SELECT `subscribers`.`email_address`, `subscribers`.`first_name`, `subscribers`.`last_name`,
        GROUP_CONCAT( DISTINCT t1.value SEPARATOR '|' ) AS 'Colors', GROUP_CONCAT( DISTINCT t2.value SEPARATOR '|' ) AS 'Languages'
FROM `subscribers`
LEFT JOIN `subscribers_multivalued` AS `t1` ON subscribers.subscriber_id = t1.subscriber_id AND t1.field_id = 112
LEFT JOIN `subscribers_multivalued` AS `t2` ON subscribers.subscriber_id = t2.subscriber_id AND t2.field_id = 111
WHERE (list_id = 40) AND (state = 1)
GROUP BY `subscribers`.`email_address` , `subscribers`.`first_name` , `subscribers`.`last_name`

With execute plan:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  subscribers     ref     FK_list_id,state_date_added     FK_list_id  4   const   1753610     Using where; Using filesort
1   SIMPLE  t1  ref     subscriber_fk,field_fk  subscriber_fk   4   chad0598_mailablel.subscribers.subscriber_id    1    
1   SIMPLE  t2  ref     subscriber_fk,field_fk  subscriber_fk   4   chad0598_mailablel.subscribers.subscriber_id    1    

Caused error for uknown reason: General error: 3 Error writing file '/tmp/MYzamaNT' (Errcode: 28) in mysql.

I rewrote it like this groupy by subscriber_id insead:

SELECT `subscribers`.`email_address`, `subscribers`.`first_name`, `subscribers`.`last_name`,
        GROUP_CONCAT( DISTINCT t1.value SEPARATOR '|' ) AS 'Colors', GROUP_CONCAT( DISTINCT t2.value SEPARATOR '|' ) AS 'Languages'
FROM `subscribers`
LEFT JOIN `subscribers_multivalued` AS `t1` ON subscribers.subscriber_id = t1.subscriber_id AND t1.field_id = 112
LEFT JOIN `subscribers_multivalued` AS `t2` ON subscribers.subscriber_id = t2.subscriber_id AND t2.field_id = 111
WHERE (list_id = 40) AND (state = 1)
GROUP BY `subscribers`.`subscriber_id`

The plan of this query is better (not filesort):

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  subscribers     ref     FK_list_id,state_date_added     FK_list_id  4   const   1753610     Using where
1   SIMPLE  t1  ref     subscriber_fk,field_fk  subscriber_fk   4   chad0598_mailablel.subscribers.subscriber_id    1    
1   SIMPLE  t2  ref     subscriber_fk,field_fk  subscriber_fk   4   chad0598_mailablel.subscribers.subscriber_id    1   

It seems that it works in mysql and it much faster that previous query. Although it works in mysql it's not compliant sql standard there shouldn't be aggregated fields in fields' list that are not present in GROUP BY. I there a way to make the query as fast as the last one and make it sql compliant?

I tried to rewrite it this way too:

SELECT `subscribers`.`email_address`, `subscribers`.`first_name`, `subscribers`.`last_name`,
        t1.colors AS 'Colors', t2.languages AS 'Languages'
FROM `subscribers`
LEFT JOIN (SELECT subscriber_id, GROUP_CONCAT( DISTINCT value SEPARATOR '|' ) AS 'colors'
           FROM subscribers_multivalued
           WHERE field_id = 112
           GROUP BY subscriber_id)  t1 ON t1.subscriber_id = `subscribers`.`subscriber_id`
LEFT JOIN (SELECT subscriber_id, GROUP_CONCAT( DISTINCT value SEPARATOR '|' ) AS 'languages'
           FROM subscribers_multivalued
           WHERE field_id = 111
           GROUP BY subscriber_id)  t2 ON t2.subscriber_id = `subscribers`.`subscriber_id`         

The plan of this query is much worse:

1   PRIMARY     subscribers     ALL     NULL    NULL    NULL    NULL    23358546     
1   PRIMARY     <derived2>  ALL     NULL    NULL    NULL    NULL    900000   
1   PRIMARY     <derived3>  ALL     NULL    NULL    NULL    NULL    900000   
3   DERIVED     subscribers_multivalued     ALL     field_fk    field_fk    4       20621115    Using filesort
2   DERIVED     subscribers_multivalued     ALL     field_fk    field_fk    4       20621115    Using filesort

and I couldn't wait untill it returns data;

I'd keep the faster one. You can also test/benchmark this variation:

SELECT s.email_address, 
       s.first_name, 
       s.last_name,
       COALESCE(t1.Colors, '') AS Colors,
       COALESCE(t2.Languages, '') AS Languages
FROM subscribers AS s
  LEFT JOIN 
    ( SELECT t1.subscriber_id, 
          GROUP_CONCAT( DISTINCT t1.value SEPARATOR '|' ) AS Colors
      FROM subscribers AS s
        JOIN subscribers_multivalued AS t1
          ON s.subscriber_id = t1.subscriber_id 
      WHERE t1.field_id = 112
        AND s.list_id = 40 
        AND s.state = 1
      GROUP BY t1.subscriber_id 
    ) AS t1
    ON s.subscriber_id = t1.subscriber_id 
  LEFT JOIN 
    ( SELECT t2.subscriber_id, 
             GROUP_CONCAT( DISTINCT t2.value SEPARATOR '|' ) AS Languages
      FROM subscribers AS s
        JOIN subscribers_multivalued AS t2
          ON s.subscriber_id = t2.subscriber_id 
      WHERE t2.field_id = 111
        AND s.list_id = 40 
        AND s.state = 1
      GROUP BY t2.subscriber_id 
    ) AS t2
    ON s.subscriber_id = t2.subscriber_id 
WHERE s.list_id = 40 
  AND s.state = 1 ;

If (field_id, subscriber_id, value) is unique in table subscribers_multivalued , you can also drop the two DISTINCT .


Regarding speed and efficiency, check the indexes you have. These would help both your version and this one:

  • in the subscribers table, an index on either (list_id, state, subscriber_id) or (state, list_id, subscriber_id)

  • in the subscribers_multivalued table, an index on (field_id, subscriber_id, value) or (second best) on (field_id, subscriber_id) .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM