Efficient MySQL many-to-many query for tags

I am having trouble finding an efficient way to select a row in the database based on its tags and also return all the other tags associated with that row. A query that does not return all of the row's tags takes about 0.001 seconds. My initial schema was more normalized and had a separate table for the tag labels, but a single query ended up taking literally seconds, so I removed that table and made the schema less normalized. Even this solution still seems quite slow.

SELECT c.*
FROM collections c,
     tags t
WHERE t.collection_id=c.id
  AND (t.name IN ("foo",
                  "bar"))
GROUP BY c.id HAVING COUNT(t.id)=2 LIMIT 10

Now I cannot come up with an efficient way to also get all the other tags for that element without the query getting too slow. My current solution is about 10 times slower, taking 0.01 seconds to complete. I also have the feeling that it does not scale well (and I find it pretty ugly).

SELECT c.*,
       GROUP_CONCAT(t1.name) AS tags
FROM collections c,
     tags t,
     tags t1
WHERE t1.collection_id = c.id
  AND t.collection_id=c.id
  AND (t.name IN ("foo",
                  "bar"))
GROUP BY c.id HAVING COUNT(t.id)=2 LIMIT 10

Is there an efficient, or at least more efficient, way to accomplish this? I would be really thankful for any advice or hints on this one!

OK. Consider the following...

DROP TABLE IF EXISTS ingredients;

CREATE TABLE ingredients 
(ingredient_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,ingredient VARCHAR(30) NOT NULL UNIQUE
);

INSERT INTO ingredients (ingredient_id, ingredient) VALUES
(1, 'Macaroni'),
(2, 'Cheese'),
(3, 'Beans'),
(4, 'Toast'),
(5, 'Jam'),
(6, 'Jacket Potato'),
(7, 'Peanut Butter');


DROP TABLE IF EXISTS recipes;

CREATE TABLE recipes 
(recipe_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,recipe VARCHAR(50) NOT NULL UNIQUE
);

INSERT INTO recipes (recipe_id, recipe) VALUES
(1, 'Macaroni & Cheese'),
(2, 'Cheese on Toast'),
(3, 'Beans on Toast'),
(4, 'Cheese & Beans on Toast'),
(5, 'Toast & Jam'),
(6, 'Beans & Macaroni'),
(9, 'Beans on Jacket Potato'),
(10, 'Cheese & Beans on Jacket Potato'),
(12, 'Peanut Butter on Toast');

DROP TABLE IF EXISTS recipe_ingredient;

CREATE TABLE recipe_ingredient 
(recipe_id INT NOT NULL
,ingredient_id INT NOT NULL
,PRIMARY KEY (recipe_id,ingredient_id)
);

INSERT INTO recipe_ingredient (recipe_id, ingredient_id) VALUES
(1, 1),
(1, 2),
(2, 2),
(2, 4),
(3, 3),
(3, 4),
(4, 2),
(4, 3),
(4, 4),
(5, 4),
(5, 5),
(6, 1),
(6, 3),
(9, 3),
(9, 6),
(10, 2),
(10, 3),
(10, 6),
(12, 4),
(12, 7);

SELECT r.*
      , GROUP_CONCAT(CASE WHEN i.ingredient IN ('Cheese','Beans') THEN i.ingredient END) i
      , GROUP_CONCAT(CASE WHEN i.ingredient NOT IN('Cheese','Beans') THEN i.ingredient END) o 
   FROM recipes r 
   LEFT 
   JOIN recipe_ingredient ri 
     ON ri.recipe_id = r.recipe_id 
   LEFT 
   JOIN ingredients i 
     ON i.ingredient_id = ri.ingredient_id 
  GROUP 
     BY r.recipe_id;

+-----------+---------------------------------+--------------+---------------------+
| recipe_id | recipe                          | i            | o                   |
+-----------+---------------------------------+--------------+---------------------+
|         1 | Macaroni & Cheese               | Cheese       | Macaroni            |
|         2 | Cheese on Toast                 | Cheese       | Toast               |
|         3 | Beans on Toast                  | Beans        | Toast               |
|         4 | Cheese & Beans on Toast         | Cheese,Beans | Toast               |
|         5 | Toast & Jam                     | NULL         | Toast,Jam           |
|         6 | Beans & Macaroni                | Beans        | Macaroni            |
|         9 | Beans on Jacket Potato          | Beans        | Jacket Potato       |
|        10 | Cheese & Beans on Jacket Potato | Cheese,Beans | Jacket Potato       |
|        12 | Peanut Butter on Toast          | NULL         | Toast,Peanut Butter |
+-----------+---------------------------------+--------------+---------------------+

Fiddle of same: http://www.sqlfiddle.com/#!2/45aa0/1
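The query above lists every recipe and only splits its ingredients into matching and non-matching columns. To restrict the result to recipes that contain both searched ingredients while still returning all of their ingredients, a HAVING clause can be added. The following is a minimal sketch against the sample tables above; the `all_ingredients` alias is made up for illustration, and the check relies on the (recipe_id, ingredient_id) primary key so that each ingredient is counted at most once per recipe.

```sql
-- Sketch: recipes that contain BOTH 'Cheese' and 'Beans',
-- returned together with their full ingredient list.
SELECT r.recipe_id
     , r.recipe
     , GROUP_CONCAT(i.ingredient ORDER BY i.ingredient_id) AS all_ingredients
  FROM recipes r
  JOIN recipe_ingredient ri
    ON ri.recipe_id = r.recipe_id
  JOIN ingredients i
    ON i.ingredient_id = ri.ingredient_id
 GROUP
    BY r.recipe_id
     , r.recipe
-- In MySQL a boolean expression evaluates to 1 or 0, so SUM() counts the matches.
HAVING SUM(i.ingredient IN ('Cheese','Beans')) = 2;
```

With the sample data this should keep only recipes 4 and 10, the two that have both Cheese and Beans.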

Rewritten to use explicit join syntax (which shouldn't make a difference to performance, as MySQL should manage to optimise the difference away):

SELECT c.*,
       GROUP_CONCAT(t1.name) AS tags
FROM collections c
INNER JOIN tags t ON t.collection_id = c.id
INNER JOIN tags t1 ON t1.collection_id = c.id
WHERE t.name IN ("foo", "bar")
GROUP BY c.id 
HAVING COUNT(t.id) = 2 
LIMIT 10
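One thing worth checking in this form: the second join to tags (t1) repeats every matching t row once per tag of the collection, so COUNT(t.id) grows with the total number of tags and may not equal 2 even when both searched tags are present. A possible adjustment, sketched here on the assumption that tags.id is unique per row, is to count distinct ids and de-duplicate the concatenated names:

```sql
-- Sketch: same joins as above, but insensitive to the row
-- multiplication caused by joining tags twice.
SELECT c.*,
       GROUP_CONCAT(DISTINCT t1.name) AS tags   -- each name appears once
FROM collections c
INNER JOIN tags t ON t.collection_id = c.id
INNER JOIN tags t1 ON t1.collection_id = c.id
WHERE t.name IN ('foo', 'bar')
GROUP BY c.id
HAVING COUNT(DISTINCT t.id) = 2                 -- both searched tags matched
LIMIT 10;
```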

It might be worth doing a separate INNER JOIN for each tag you are checking, which eliminates the need for the HAVING clause:

SELECT c.*,
       GROUP_CONCAT(t1.name) AS tags
FROM collections c
INNER JOIN tags t ON t.collection_id = c.id AND t.name = "foo"
INNER JOIN tags t0 ON t0.collection_id = c.id AND t0.name = "bar"
INNER JOIN tags t1 ON t1.collection_id = c.id
GROUP BY c.id 
LIMIT 10
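A further variant, offered only as a sketch and not taken from the answers above, is to find the matching collection ids in a derived table first and then join back to fetch all tags for just those rows. It assumes a tag name appears at most once per collection, so that COUNT(*) = 2 means both tags are present; the alias `m` is arbitrary.

```sql
-- Sketch: narrow down to (at most) 10 matching collections first,
-- then fetch every tag for only those collections.
SELECT c.*,
       GROUP_CONCAT(t1.name) AS tags
FROM (
    SELECT collection_id
    FROM tags
    WHERE name IN ('foo', 'bar')
    GROUP BY collection_id
    HAVING COUNT(*) = 2      -- assumes (collection_id, name) is unique
    LIMIT 10
) m
INNER JOIN collections c ON c.id = m.collection_id
INNER JOIN tags t1 ON t1.collection_id = c.id
GROUP BY c.id;
```

Because the LIMIT is applied inside the derived table, the outer join only has to gather tags for the few collections that survive the filter.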

However, your original query doesn't look bad, so it is possibly an index issue.
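To check the index theory, something along these lines could be tried. It is a sketch only: the index names are hypothetical, and it assumes the tags table has the id, collection_id and name columns used in the question.

```sql
-- Hypothetical index names; adjust to the real schema.
-- Helps find collections by tag name (the WHERE t.name IN (...) part).
CREATE INDEX idx_tags_name_collection ON tags (name, collection_id);

-- Helps fetch all tags of a given collection (the join on collection_id).
CREATE INDEX idx_tags_collection_name ON tags (collection_id, name);

-- Inspect the plan to see whether the indexes are actually used.
EXPLAIN
SELECT c.*
FROM collections c
INNER JOIN tags t ON t.collection_id = c.id
WHERE t.name IN ('foo', 'bar')
GROUP BY c.id
HAVING COUNT(t.id) = 2
LIMIT 10;
```

If the EXPLAIN output shows a full scan of tags, a missing or unsuitable index is a likely culprit.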
