
MySQL intersection

I have an existing site whose database is not designed correctly and already contains a lot of records, so we can't change the DB structure.

The database for the current issue mainly contains 4 tables: users, questions, options and answers. There is a standard set of questions and options, but for each user there is one row in the answers table for each question-and-option combination. The DB structure and example data are available at SQL Fiddle.

Now, as a new requirement for advanced search, I need to find users by applying multiple search filters. Example input and expected output are given in the comments on SQL Fiddle.

I tried all types of joins and intersections, but it always fails somehow. Can someone please help me write a correct query, preferably with lightweight/optimized joins, as the DB contains a lot of records (10,000+ users, 100+ questions, 500+ options and 500,000+ records in the answers table)?

EDIT: Based on the two answers, I used the following query:

SELECT u.id, u.first_name, u.last_name
FROM users u
    JOIN answers a ON a.user_id = u.id
WHERE (a.question_id = 1 AND a.option_id IN (3, 5))
    OR (a.question_id = 2 AND a.option_id IN (8))
GROUP BY u.id, u.first_name, u.last_name
HAVING
    SUM(CASE WHEN (a.question_id = 1 AND a.option_id IN (3, 5)) THEN 1 ELSE 0 END) >=1
    AND SUM(CASE WHEN (a.question_id = 2 AND a.option_id IN (8)) THEN 1 ELSE 0 END) >= 1;

Please note: on the real database, the columns user_id, question_id and option_id of the answers table are indexed.

The running query is available on SQL Fiddle.

SQL Fiddle for dnoeth's answer.

SQL Fiddle for calcinai's answer.

Add all n of your filters to the WHERE clause using OR, and repeat them in a HAVING (SUM(CASE ...)) using AND:

SELECT u.id, u.first_name, u.last_name
FROM users u JOIN answers a
  ON a.user_id = u.id
JOIN questions q
  ON a.question_id = q.id
JOIN question_options o
  ON a.option_id = o.id
WHERE (q.question = 'Language known' AND o.OPTION IN ('French','Russian'))
   OR (q.question = 'height' AND o.OPTION = '1.51 - 1.7')
GROUP BY u.id, u.first_name, u.last_name
HAVING
  SUM(CASE WHEN (q.question = 'Language known' AND o.OPTION IN ('French','Russian')) THEN 1 ELSE 0 END) >=1
AND 
  SUM(CASE WHEN (q.question = 'height'         AND o.OPTION = '1.51 - 1.7')          THEN 1 ELSE 0 END) >= 1
;

I changed your joins to the more readable standard SQL syntax.
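The WHERE ... OR / HAVING ... AND pattern above extends mechanically to any number of filters. Here is a minimal sketch in Python of how such a query could be assembled, assuming the filters arrive as (question_id, [option_ids]) pairs as in the question's EDIT query. The function name build_or_having_query and the sample rows are hypothetical, and an in-memory SQLite database stands in for MySQL only so that the sketch runs on its own.

```python
import sqlite3


def build_or_having_query(filters):
    """Build the WHERE (OR) / HAVING SUM(CASE) (AND) query for ID-based filters.

    filters: list of (question_id, [option_id, ...]) pairs; a user must match all.
    Returns (sql, params) with '?' placeholders (an assumption: SQLite style).
    """
    if not filters or any(not option_ids for _, option_ids in filters):
        raise ValueError("need at least one filter, each with at least one option")
    conds, flat = [], []
    for question_id, option_ids in filters:
        marks = ", ".join("?" for _ in option_ids)
        conds.append(f"(a.question_id = ? AND a.option_id IN ({marks}))")
        flat += [question_id, *option_ids]
    where = "\n   OR ".join(conds)
    having = "\n   AND ".join(
        f"SUM(CASE WHEN {c} THEN 1 ELSE 0 END) >= 1" for c in conds
    )
    sql = (
        "SELECT u.id, u.first_name, u.last_name\n"
        "FROM users u JOIN answers a ON a.user_id = u.id\n"
        f"WHERE {where}\n"
        "GROUP BY u.id, u.first_name, u.last_name\n"
        f"HAVING {having}\n"
        "ORDER BY u.id"
    )
    # Every condition appears in WHERE and again in HAVING, so bind values twice.
    return sql, flat + flat


# Hypothetical sample data, not the data from the SQL Fiddle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE answers (user_id INTEGER, question_id INTEGER, option_id INTEGER);
    INSERT INTO users VALUES (1, 'Ann', 'A'), (2, 'Bob', 'B'), (3, 'Cy', 'C');
    INSERT INTO answers VALUES
        (1, 1, 3), (1, 2, 8),   -- user 1 matches both filters
        (2, 1, 5), (2, 2, 9),   -- user 2 matches only the first filter
        (3, 1, 4), (3, 2, 8);   -- user 3 matches only the second filter
""")
sql, params = build_or_having_query([(1, [3, 5]), (2, [8])])
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Only bound values come from the caller; the SQL text itself is built from fixed fragments. Python drivers for MySQL commonly use %s rather than ? as the placeholder, so the marker would need adjusting there.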

This will require a bit of fiddling for a dynamic filter, but what you really want to do is search by the IDs, as it will mean fewer joins and a faster query.

This produces the results you'd expect. I assume that the search filters are generated from options in the database, so instead of passing the actual value back into the query, pass the ID.

The multiple inner joins support multiple AND criteria and automatically reduce your result set: each filter gets its own join against answers, so a user survives only if every join finds a matching row.

SELECT u.id, u.first_name, u.last_name
FROM users u
INNER JOIN answers a ON a.user_id = u.id
  AND (a.question_id, a.option_id) IN ((1,3),(1,5)) # q 1: Lang, answer 3/5: En/Ru
INNER JOIN answers a2 ON a2.user_id = u.id
  AND (a2.question_id, a2.option_id) = (2,8) # q 2: Height, answer 8: 1.71...
GROUP BY u.id, u.first_name, u.last_name;
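For a dynamic filter, the one-join-per-filter query above can be generated in a loop. The following is a rough sketch in Python under a few assumptions: filters arrive as (question_id, [option_ids]) pairs, the function name build_self_join_query and the sample rows are hypothetical, and an in-memory SQLite database stands in for MySQL so the example is self-contained. It writes each condition as question_id = ? AND option_id IN (...), which selects the same rows as the row-constructor form used above.

```python
import sqlite3


def build_self_join_query(filters):
    """Build a query with one INNER JOIN on answers per filter.

    filters: list of (question_id, [option_id, ...]) pairs; a user must match all.
    Returns (sql, params) with '?' placeholders (an assumption: SQLite style).
    """
    if not filters or any(not option_ids for _, option_ids in filters):
        raise ValueError("need at least one filter, each with at least one option")
    joins, params = [], []
    for i, (question_id, option_ids) in enumerate(filters, start=1):
        alias = f"a{i}"  # generated here, never taken from user input
        marks = ", ".join("?" for _ in option_ids)
        joins.append(
            f"INNER JOIN answers {alias} ON {alias}.user_id = u.id\n"
            f"  AND {alias}.question_id = ? AND {alias}.option_id IN ({marks})"
        )
        params += [question_id, *option_ids]
    sql = (
        "SELECT u.id, u.first_name, u.last_name\n"
        "FROM users u\n"
        + "\n".join(joins)
        + "\nGROUP BY u.id, u.first_name, u.last_name\nORDER BY u.id"
    )
    return sql, params


# Hypothetical sample data, not the data from the SQL Fiddle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE answers (user_id INTEGER, question_id INTEGER, option_id INTEGER);
    INSERT INTO users VALUES (1, 'Ann', 'A'), (2, 'Bob', 'B'), (3, 'Cy', 'C');
    INSERT INTO answers VALUES
        (1, 1, 3), (1, 1, 5), (1, 2, 8),   -- user 1 matches both filters
        (2, 1, 5), (2, 2, 9),              -- user 2 matches only the first filter
        (3, 1, 4), (3, 2, 8);              -- user 3 matches only the second filter
""")
sql, params = build_self_join_query([(1, [3, 5]), (2, [8])])
rows = conn.execute(sql, params).fetchall()
print(rows)
```

User 1 has two matching answers for question 1, so the joins produce two rows for that user; the GROUP BY collapses them into one.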

I'd suggest making sure there's a composite index on (user_id, question_id, option_id) for searching:

ALTER TABLE `answers` ADD INDEX idx_search(`user_id`, `question_id`, `option_id`);

Otherwise it should be using primary keys for the joins (if properly defined), so it will be fast.

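Notice: Technical posts on this site follow the CC BY-SA 4.0 license. If you republish, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.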

 