GROUP_CONCAT with FIND_IN_SET, multiple joins

I want to retrieve items that have certain filters set. For example, listing items that are (red or blue) and small should return only the item apple: ((red(2) or blue(4)) and small(5)) => apple

I have found 2 solutions, but both seem overly complex to me. The first solution seems more elegant, because adding another filter with AND is quite simple, while the second solution requires another JOIN for each additional filter. I hope I am overlooking something and there is a much better solution than these.

The questions:

  1. is there a better solution?
  2. if there is no better solution - which one is faster/recommended?

item table

│ id │ itemname │
├────┼──────────┤
│ 1  │ apple    │
│ 2  │ orange   │
│ 3  │ banana   │
│ 4  │ melon    │

filter table

│ id │ filtername │
├────┼────────────┤
│ 1  │ orange     │
│ 2  │ red        │
│ 3  │ green      │
│ 4  │ blue       │
│ 5  │ small      │
│ 6  │ medium     │
│ 7  │ big        │
│ 8  │ yellow     │

item_filter

│ item_id │ filter_id │
├─────────┼───────────┤
│ 1       │ 2         │
│ 1       │ 3         │
│ 1       │ 5         │
│ 2       │ 1         │
│ 2       │ 5         │
│ 3       │ 6         │
│ 3       │ 8         │
│ 4       │ 3         │
│ 4       │ 7         │

First solution based on GROUP_CONCAT and FIND_IN_SET

sqlfiddle: http://sqlfiddle.com/#!9/26f99/1/0

SELECT * FROM item
JOIN (
    SELECT item_id, GROUP_CONCAT(filter_id) AS filters
    FROM item_filter
    GROUP BY item_id
) AS grp ON grp.item_id = item.id
WHERE (FIND_IN_SET(2,filters) OR FIND_IN_SET(4,filters)) AND FIND_IN_SET(5, filters)

Second solution based on JOIN and where clause only

sqlfiddle: http://sqlfiddle.com/#!9/f0b95/1/0

SELECT itemname FROM item
JOIN item_filter as filter1 on item.id=filter1.item_id
JOIN item_filter as filter2 on item.id=filter2.item_id
WHERE (filter1.filter_id=2 or filter1.filter_id=4) and filter2.filter_id=5
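As a quick sanity check, here is a minimal sketch that loads the sample data above and runs both queries. It assumes Python's sqlite3 module as a stand-in for MySQL; SQLite has no FIND_IN_SET, so the `find_in_set` helper below is a hypothetical emulation registered under that name, not MySQL's implementation.

```python
import sqlite3

def find_in_set(needle, haystack):
    """Rough emulation of MySQL's FIND_IN_SET: 1-based position, 0 if absent."""
    if needle is None or haystack is None:
        return None
    parts = str(haystack).split(",")
    needle = str(needle)
    return parts.index(needle) + 1 if needle in parts else 0

conn = sqlite3.connect(":memory:")
# Assumption: register the emulation so the first query runs unchanged.
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, itemname TEXT);
    CREATE TABLE item_filter (
        item_id INTEGER, filter_id INTEGER,
        PRIMARY KEY (item_id, filter_id)
    );
    INSERT INTO item VALUES (1,'apple'),(2,'orange'),(3,'banana'),(4,'melon');
    INSERT INTO item_filter VALUES
        (1,2),(1,3),(1,5),(2,1),(2,5),(3,6),(3,8),(4,3),(4,7);
""")

SOLUTION_ONE = """
    SELECT item.itemname FROM item
    JOIN (
        SELECT item_id, GROUP_CONCAT(filter_id) AS filters
        FROM item_filter
        GROUP BY item_id
    ) AS grp ON grp.item_id = item.id
    WHERE (FIND_IN_SET(2, filters) OR FIND_IN_SET(4, filters))
      AND FIND_IN_SET(5, filters)
"""

SOLUTION_TWO = """
    SELECT itemname FROM item
    JOIN item_filter AS filter1 ON item.id = filter1.item_id
    JOIN item_filter AS filter2 ON item.id = filter2.item_id
    WHERE (filter1.filter_id = 2 OR filter1.filter_id = 4)
      AND filter2.filter_id = 5
"""

result_one = [row[0] for row in conn.execute(SOLUTION_ONE)]
result_two = [row[0] for row in conn.execute(SOLUTION_TWO)]
print(result_one, result_two)
```

One thing worth noting: the second query can return the same item more than once if an item carries both filter 2 and filter 4, because filter1 then matches two rows. SELECT DISTINCT would avoid that.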

I am no MySQL expert but here is my two cents.

You should use the MySQL EXPLAIN statement to get details about how the query would be executed: http://dev.mysql.com/doc/refman/5.7/en/explain-output.html

But before that, you should add a composite key (index) to your relation table, that is, the item_filter table. Without it the EXPLAIN result would not be relevant, as that table would be fully scanned for each query.
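A minimal sketch of what such a composite key could look like, again assuming SQLite through Python's sqlite3 module as a stand-in. In MySQL the equivalent would be something like ALTER TABLE item_filter ADD PRIMARY KEY (item_id, filter_id). SQLite's EXPLAIN QUERY PLAN is used here in place of MySQL's EXPLAIN, so the output format differs from the tables below and depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, itemname TEXT);
    -- Composite primary key: one index covering (item_id, filter_id).
    CREATE TABLE item_filter (
        item_id   INTEGER NOT NULL,
        filter_id INTEGER NOT NULL,
        PRIMARY KEY (item_id, filter_id)
    );
""")

# The composite key also rejects duplicate (item_id, filter_id) pairs.
conn.execute("INSERT INTO item_filter VALUES (1, 2)")
try:
    conn.execute("INSERT INTO item_filter VALUES (1, 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

query = """
    SELECT itemname FROM item
    JOIN item_filter AS filter1 ON item.id = filter1.item_id
    JOIN item_filter AS filter2 ON item.id = filter2.item_id
    WHERE (filter1.filter_id = 2 OR filter1.filter_id = 4)
      AND filter2.filter_id = 5
"""

# The fourth column of each plan row is the human-readable step description.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan:
    print(step)
```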

Now, running EXPLAIN on both your queries, you will notice that your second solution is clearly the better one from a performance standpoint (assuming you added the index to the item_filter table):

mysql> EXPLAIN SELECT * FROM item
    -> JOIN (
    ->     SELECT item_id, GROUP_CONCAT(filter_id) AS filters
    ->     FROM item_filter
    ->     GROUP BY item_id
    -> ) AS grp ON grp.item_id = item.id
    -> WHERE (FIND_IN_SET(2,filters) OR FIND_IN_SET(4,filters)) AND FIND_IN_SET(5, filters);
+----+-------------+-------------+-------+---------------+---------+---------+------+------+--------------------------------+
| id | select_type | table       | type  | possible_keys | key     | key_len | ref  | rows | Extra                          |
+----+-------------+-------------+-------+---------------+---------+---------+------+------+--------------------------------+
|  1 | PRIMARY     | <derived2>  | ALL   | NULL          | NULL    | NULL    | NULL |    4 | Using where                    |
|  1 | PRIMARY     | item        | ALL   | PRIMARY       | NULL    | NULL    | NULL |    4 | Using where; Using join buffer |
|  2 | DERIVED     | item_filter | index | NULL          | PRIMARY | 8       | NULL |    9 | Using index                    |
+----+-------------+-------------+-------+---------------+---------+---------+------+------+--------------------------------+
3 rows in set (0.00 sec)

mysql> EXPLAIN SELECT itemname FROM item
    -> JOIN item_filter as filter1 on item.id=filter1.item_id
    -> JOIN item_filter as filter2 on item.id=filter2.item_id
    -> WHERE (filter1.filter_id=2 or filter1.filter_id=4) and filter2.filter_id=5;
+----+-------------+---------+--------+---------------+---------+---------+--------------------+------+--------------------------+
| id | select_type | table   | type   | possible_keys | key     | key_len | ref                | rows | Extra                    |
+----+-------------+---------+--------+---------------+---------+---------+--------------------+------+--------------------------+
|  1 | SIMPLE      | item    | ALL    | PRIMARY       | NULL    | NULL    | NULL               |    4 |                          |
|  1 | SIMPLE      | filter1 | ref    | PRIMARY       | PRIMARY | 4       | test.item.id       |    1 | Using where; Using index |
|  1 | SIMPLE      | filter2 | eq_ref | PRIMARY       | PRIMARY | 8       | test.item.id,const |    1 | Using index              |
+----+-------------+---------+--------+---------------+---------+---------+--------------------+------+--------------------------+
3 rows in set (0.01 sec)

mysql>

Without going into details:

  • Solution one performs two full table scans and a full index scan, and reads an estimated 17 rows (4 + 4 + 9). On top of that, I am not convinced about the performance impact of GROUP_CONCAT and FIND_IN_SET.

  • Solution two performs a single full table scan and reads an estimated 6 rows in total (4 + 1 + 1).

Check the EXPLAIN Join Types documentation for more information: http://dev.mysql.com/doc/refman/5.7/en/explain-output.html#explain-join-types

The first solution is not going to make useful use of indexes. The sub query will use an index and return a lot of records, but those records will then be checked without indexes.

For example, if you had 10000 rows in the item table, the sub query is going to return up to 10000 rows. For each of those rows the database has to call a function to check for the filters. As it is the result of a sub query it will not use the indexes (and further, FIND_IN_SET won't use indexes).
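To illustrate the per-row cost, here is a rough sketch that counts how often the function is invoked. It assumes SQLite through Python's sqlite3 module, a hypothetical `find_in_set` emulation registered under the MySQL name, and synthetic data (`N_ITEMS` rows) that is not part of the question. MySQL's internals differ, but the point is the same: the predicate has to be evaluated for every grouped row.

```python
import sqlite3

calls = 0

def find_in_set(needle, haystack):
    """Hypothetical FIND_IN_SET emulation that counts its invocations."""
    global calls
    calls += 1
    parts = str(haystack).split(",")
    needle = str(needle)
    return parts.index(needle) + 1 if needle in parts else 0

N_ITEMS = 1000  # synthetic size, chosen only for the illustration

conn = sqlite3.connect(":memory:")
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.execute("""
    CREATE TABLE item_filter (
        item_id INTEGER, filter_id INTEGER,
        PRIMARY KEY (item_id, filter_id)
    )
""")

# Every item is green (3); only item 1 is also red (2) and small (5).
rows = [(i, 3) for i in range(1, N_ITEMS + 1)] + [(1, 2), (1, 5)]
conn.executemany("INSERT INTO item_filter VALUES (?, ?)", rows)

matches = conn.execute("""
    SELECT item_id FROM (
        SELECT item_id, GROUP_CONCAT(filter_id) AS filters
        FROM item_filter
        GROUP BY item_id
    ) AS grp
    WHERE (FIND_IN_SET(2, filters) OR FIND_IN_SET(4, filters))
      AND FIND_IN_SET(5, filters)
""").fetchall()

print("matching items:", matches)
print("FIND_IN_SET calls:", calls)
```

Only one item matches, yet the function runs at least once for each of the grouped rows, which is the work an index lookup on filter_id would avoid.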

The 2nd solution should be far quicker (but, as you say, less flexible when adding new filters). Note that you would probably want an index on the item_filter table covering both item_id and filter_id (and probably a 2nd index just on the filter_id column).

I expect MySQL would execute this as:

SELECT itemname
FROM item_filter AS filter2
JOIN item_filter AS filter1 ON filter2.item_id = filter1.item_id
JOIN item ON item.id = filter1.item_id
WHERE (filter1.filter_id=2 OR filter1.filter_id=4) AND filter2.filter_id=5

This way it can use the most selective index first, join that to the 2nd filter (using the index on item_id, narrowed down by the checks for filters 2 and 4) and then join item based on the item_id (which I would hope is the primary key).
