How to optimize the “IN (SELECT…” query
I'm trying to select from two tables: table_a has 600 million rows, while table_b has only 20.
The code currently looks something like this:
SELECT
field_1,field_2
FROM
table_a
WHERE
table_a.field_3 IN (SELECT field_3 FROM table_b WHERE field_4 LIKE 'some_phrase%')
It works fine but is very slow. I guess it's slow because it has to evaluate the select in the WHERE clause for each row. I thought I could somehow store the values from the select in a variable and use that variable instead of a nested select, but I cannot make it work. I was thinking about something like this:
SELECT @myVariable := field_3 FROM table_b WHERE field_4 LIKE 'some_phrase%';
SELECT
field_1,field_2
FROM
table_a
WHERE
table_a.field_3 IN (@myVariable)
I learned that it won't work with IN(), so I also tried FIND_IN_SET, but I couldn't make that work either. I would appreciate any help.
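For reference, the variable-based attempt fails because `@myVariable := field_3` assigns one scalar per row, so the variable ends up holding a single value and `IN (@myVariable)` compares against just that value. A minimal sketch of a version that does run is shown below, assuming the field_3 values contain no commas and the concatenated string fits within `group_concat_max_len`. Because FIND_IN_SET is applied to table_a.field_3, MySQL cannot use an index on that column here, so this is likely to scan all of table_a and is shown only to explain the behavior, not as a recommended fix.

```sql
-- Collect the matching field_3 values into one comma-separated string.
SELECT GROUP_CONCAT(field_3) INTO @myVariable
FROM table_b
WHERE field_4 LIKE 'some_phrase%';

-- FIND_IN_SET returns the 1-based position in the list, or 0 if absent.
SELECT field_1, field_2
FROM table_a
WHERE FIND_IN_SET(table_a.field_3, @myVariable) > 0;
```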
Instead of an IN clause, you could use a JOIN on the subquery:
SELECT field_1,field_2
FROM table_a
INNER JOIN (
SELECT field_3
FROM table_b
WHERE field_4 LIKE 'some_phrase%'
) t on t.field_3 = table_a.field_3
But be sure you have a proper index on column field_3 of table_b and on column field_3 of table_a.
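As a rough sketch of that advice, the two single-column indexes could be created as follows. The index names `idx_a_field_3` and `idx_b_field_3` are made up for illustration; note that building an index on a 600-million-row table can take a long time.

```sql
-- Hypothetical index names; adjust to your own naming scheme.
CREATE INDEX idx_a_field_3 ON table_a (field_3);
CREATE INDEX idx_b_field_3 ON table_b (field_3);
```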
Actually, assuming the subquery on table_b is not particularly large or slow, you might want to focus on optimizing the outer query on table_a. Adding an appropriate index is one option, such as:
CREATE INDEX idx ON table_a (field_3, field_1, field_2);
This index should completely cover the WHERE and SELECT clauses. Note that because the subquery is uncorrelated, MySQL can evaluate it just once and cache (materialize) the result set rather than re-running it for every row. If the subquery is very large, then you might want to rewrite the query using a join:
SELECT DISTINCT a.field_1, a.field_2
FROM table_a a
INNER JOIN table_b b
ON a.field_3 = b.field_3
WHERE
b.field_4 LIKE 'some_phrase%';
Here the following additional index might help:
CREATE INDEX idx2 ON table_b (field_4, field_3);
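To check whether MySQL actually picks up the indexes, you can prefix the query with EXPLAIN. This is only a sketch: the plan you get depends on your MySQL version and on the data distribution.

```sql
-- Show the execution plan for the join version of the query.
EXPLAIN
SELECT DISTINCT a.field_1, a.field_2
FROM table_a a
INNER JOIN table_b b
    ON a.field_3 = b.field_3
WHERE b.field_4 LIKE 'some_phrase%';
```

In the output, the `key` column shows which index was chosen for each table, and `Using index` in the `Extra` column indicates that the table was read from a covering index alone.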