This has always bugged me: why does this query
SELECT
*
FROM
`TABLE`
WHERE `value` IN
(SELECT
val
FROM
OTHER_TABLE
WHERE `date` < '2014-01-01')
run orders of magnitude slower than sequentially running both this query:
SELECT
`val`
FROM
OTHER_TABLE
WHERE `date` < '2014-01-01'
Result:
+-----+
| val |
+-----+
| v1  |
| v2  |
| v3  |
| v7  |
| v12 |
+-----+
and this query:
SELECT
*
FROM
`TABLE`
WHERE `value` IN ('v1', 'v2', 'v3', 'v7', 'v12')
From the docs: (emphasis added by me)
Subquery optimization for IN is not as effective as for the = operator or for the IN(value_list) operator.
A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.
The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery:
SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);
The optimizer rewrites the statement to a correlated subquery:
SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);
If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.
An implication is that an IN subquery can be much slower than a query written using an IN(value_list) operator that lists the same values that the subquery would return.
http://dev.mysql.com/doc/refman/5.7/en/subquery-restrictions.html
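To make the O(M×N) versus O(M+N) difference from the quote concrete, here is a rough Python sketch. It is only a toy model of the two evaluation strategies, not MySQL's actual executor: the function names, the operation counters, and the 10,000-row outer "table" are all made up for illustration. The first function mimics the correlated EXISTS rewrite by rescanning the subquery rows for every outer row; the second evaluates the subquery once and then probes a set, which is roughly what a literal IN(value_list) allows.

```python
def correlated_scan(outer_rows, inner_rows):
    """Toy model of the EXISTS rewrite: rescan inner_rows for every outer row."""
    comparisons = 0
    matched = []
    for a in outer_rows:
        for b in inner_rows:
            comparisons += 1
            if a == b:
                matched.append(a)
                break  # EXISTS can stop at the first match
    return matched, comparisons


def materialized_lookup(outer_rows, inner_rows):
    """Toy model of running the subquery once, then probing a hash set."""
    operations = 0
    values = set()
    for b in inner_rows:  # one pass over the M subquery rows
        operations += 1
        values.add(b)
    matched = []
    for a in outer_rows:  # one pass over the N outer rows
        operations += 1
        if a in values:
            matched.append(a)
    return matched, operations


# Hypothetical data: the five values from the question, 10,000 outer rows.
inner = ["v1", "v2", "v3", "v7", "v12"]
outer = [f"v{i}" for i in range(10_000)]

slow_rows, slow_cost = correlated_scan(outer, inner)
fast_rows, fast_cost = materialized_lookup(outer, inner)
print(slow_rows == fast_rows, slow_cost, fast_cost)
```

Both functions return the same rows, but the first does close to M×N comparisons while the second does M+N operations. With only five subquery rows the gap is about a factor of five here; it grows with M, and a real server additionally pays for re-executing the subquery per row, which this sketch does not model.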
Hope this helps anyone else who might have been curious.