简体   繁体   中英

distinct or group?which is more efficient in mysql?

Here is my trial:

mysql> select a.id from iask a
    ->                                  join ianswer b on a.id=b.iaskid
    ->                                  join users c on c.id=a.uid
    ->                          where (c.last_check is null or a.created>c.last_check) and c.id=1
    ->                          group by a.id;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.01 sec)

mysql> select distinct a.id from iask a
    ->                                  join ianswer b on a.id=b.iaskid
    ->                                  join users c on c.id=a.uid
    ->                          where (c.last_check is null or a.created>c.last_check) and c.id=1;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql> explain extended select distinct a.id from iask a
    ->                          join ianswer b on a.id=b.iaskid
    ->                          join users c on c.id=a.uid
    ->                  where (c.last_check is null or a.created>c.last_check) and c.id=1;
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
| id | select_type | table | type  | possible_keys             | key              | key_len | ref      | rows | filtered | Extra                 |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
|  1 | SIMPLE      | c     | const | PRIMARY,i_users_lastcheck | PRIMARY          | 4       | const    |    1 |   100.00 | Using temporary       |
|  1 | SIMPLE      | a     | ref   | PRIMARY,i_iask_uid        | i_iask_uid       | 4       | const    |    1 |   100.00 | Using where           |
|  1 | SIMPLE      | b     | ref   | i_ianswer_iaskid          | i_ianswer_iaskid | 4       | bbs.a.id |    7 |   100.00 | Using index; Distinct |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
3 rows in set, 1 warning (0.00 sec)

mysql> explain extended select a.id from iask a
    ->                          join ianswer b on a.id=b.iaskid
    ->                          join users c on c.id=a.uid
    ->                  where (c.last_check is null or a.created>c.last_check) and c.id=1
    ->                  group by a.id;
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
| id | select_type | table | type  | possible_keys             | key              | key_len | ref      | rows | filtered | Extra                           |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
|  1 | SIMPLE      | c     | const | PRIMARY,i_users_lastcheck | PRIMARY          | 4       | const    |    1 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | a     | ref   | PRIMARY,i_iask_uid        | i_iask_uid       | 4       | const    |    1 |   100.00 | Using where                     |
|  1 | SIMPLE      | b     | ref   | i_ianswer_iaskid          | i_ianswer_iaskid | 4       | bbs.a.id |    7 |   100.00 | Using index                     |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
3 rows in set, 1 warning (0.00 sec)

They are two close but not exactly identical concepts. On most queries (including the queries you provided) they are identical, I don't believe there will be a difference and will be optimized identically (though you could do explain extended if you want to see if there was any differences) (see here on optimizing distinct and group by ).

So in short, most likely the same, unless there is a difference in result set, and then you should be using the one to give you the proper result set anyway, so it doesn't matter.

As Alex has mentioned, without knowing your table structure it's difficult to comment.

However, DISTINCT and GROUP BY are used for very different purposes. (A good SQL reference should explain the difference between the two.) Attempting to determine which one is more efficient is meaningless.

If I am interpreting your query correctly, you're trying to return the list of new "question" IDs since user 1 did something to change the value of last_check.

If I had created your tables, a.id would be the primary key of the iask table -- hence, every value of a.id would be unique. Therefore, neither DISTINCT nor GROUP BY have an effect on your queries because

  • There is only ever one row with the value of a.id (DISTINCT not needed); and

  • There is nothing to GROUP BY since no two rows can have the same value of a.id (GROUP BY not needed).

The only meaningful query in this circumstance is just

select a.id
from   iask a
       join ianswer b on a.id = b.iaskid
       join users c on c.id = a.uid
where  (c.last_check is null or a.created > c.last_check)
       and c.id = 1;

You're posting the EXPLAIN SELECT DISTINCT , but not the EXPLAIN SELECT ... GROUP BY (and without seeing exactly how your table and indices are we can't do it for you;-). I'd expect the two to be identical, meaning no difference in performance; if there are different (and DB optimizers are quirky, especially MySQL's, so it's always best to check than to guess;-) they should indicate pretty clearly which approach is going to provide better performance.

PS: you should do your EXPLAIN runs on (possibly fake) tables that are reasonably representative of real life -- the optimizer may decide for a full table scan rather than an index because it sees too little diversity along that index on toy data, but it would use the index on real data with more diversity, for example.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM