How to optimize the “IN (SELECT…” query

I'm trying to select from two tables: table_a has 600 million rows, while table_b has only 20.

The code currently looks something like the one below.

        SELECT
            field_1,field_2
        FROM
            table_a
        WHERE
             table_a.field_3 IN (SELECT field_3 FROM table_b WHERE field_4 LIKE 'some_phrase%')

It works fine but is very slow. I guess it's slow because it has to check each row against the select in the WHERE clause. I thought I could somehow store the values from the select in a variable and use that variable instead of a nested select, but I cannot make it work. I was thinking about something like this:

SELECT @myVariable := field_3 FROM table_b WHERE field_4 LIKE 'some_phrase%';

        SELECT
            field_1,field_2
        FROM
            table_a
        WHERE
             table_a.field_3 IN (@myVariable)

I learned that it won't work with IN(), so I also tried FIND_IN_SET, but I couldn't make that work either. I would appreciate any help.

Instead of an IN clause, you could use a JOIN on the subquery:

  SELECT field_1, field_2
  FROM table_a
  INNER JOIN (
      SELECT field_3
      FROM table_b
      WHERE field_4 LIKE 'some_phrase%'
  ) t ON t.field_3 = table_a.field_3

But be sure you have a proper index on column field_3 of table_b and on column field_3 of table_a.
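
As a minimal sketch, those two indexes could be created as shown here. The index names are made up for illustration, and you should skip any index that already exists (for example, if field_3 is already a key on either table):

```sql
-- Hypothetical index names; pick names that fit your own schema.
-- Lets the join look up matching rows in the 600-million-row table by field_3.
CREATE INDEX idx_table_a_field_3 ON table_a (field_3);

-- Index on the join column of the small table.
CREATE INDEX idx_table_b_field_3 ON table_b (field_3);
```

With only about 20 rows in table_b, the index on table_a is likely the one that matters most.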

Actually, assuming the subquery on table_b is not particularly large or slow, you might want to focus on optimizing the outer query on table_a. Adding an appropriate index is one option, such as:

CREATE INDEX idx ON table_a (field_3, field_1, field_2);

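This index should completely cover the WHERE and SELECT clauses: field_3 comes first so it can be used for the lookup, and field_1 and field_2 can then be read from the index itself without touching the table rows. Note that in the case of the subquery, MySQL would typically just evaluate it once and cache the result set. If the subquery is very large, then you might want to rewrite the query using a join: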

SELECT DISTINCT a.field_1, a.field_2
FROM table_a a
INNER JOIN table_b b
    ON a.field_3 = b.field_3
WHERE
    b.field_4 LIKE 'some_phrase%';

Here the following additional index might help:

CREATE INDEX idx2 ON table_b (field_4, field_3);
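
To check whether MySQL actually picks up these indexes, you could prefix the join query with EXPLAIN. This is only a sketch of how to inspect the plan; the plan you get depends on your MySQL version and table statistics:

```sql
-- Show the execution plan instead of running the query.
EXPLAIN
SELECT DISTINCT a.field_1, a.field_2
FROM table_a a
INNER JOIN table_b b
    ON a.field_3 = b.field_3
WHERE
    b.field_4 LIKE 'some_phrase%';
```

In the output, the key column shows which index was chosen for each table, and "Using index" in the Extra column indicates that the index covers the query for that table.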
