SQL - IN vs. NOT IN

Suppose I have a table with a column that takes values from 1 to 10. I need to select the rows with every value except 9 and 10. Will there be a difference (performance-wise) if I use this query:

SELECT * FROM tbl WHERE col NOT IN (9, 10)

and this one?

SELECT * FROM tbl WHERE col IN (1, 2, 3, 4, 5, 6, 7, 8)

Use "IN", as it will most likely make the DBMS use an index on the corresponding column.

"NOT IN" could in theory also be translated into index usage, but in a more complicated way that the DBMS might not "spend overhead time" on.

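Whether an index is actually used can be checked rather than guessed. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in engine; the table layout, the index name idx_tbl_col and the helper query_plan are made up for illustration, and other DBMSs may plan these queries differently.

```python
import sqlite3

def query_plan(conn, sql):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)

# Hypothetical table and index, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, col INTEGER)")
conn.execute("CREATE INDEX idx_tbl_col ON tbl (col)")
conn.executemany(
    "INSERT INTO tbl (col) VALUES (?)",
    [(i % 10 + 1,) for i in range(1000)],
)

in_plan = query_plan(conn, "SELECT * FROM tbl WHERE col IN (1, 2, 3, 4, 5, 6, 7, 8)")
not_in_plan = query_plan(conn, "SELECT * FROM tbl WHERE col NOT IN (9, 10)")

print("IN:    ", in_plan)
print("NOT IN:", not_in_plan)
```

In SQLite's plan output, a line starting with SEARCH means an index lookup and a line starting with SCAN means every row is visited. Most engines have a comparable EXPLAIN facility.
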
When it comes to performance, you should always profile your code (i.e. run your queries a few thousand times and measure each loop's performance using some kind of stopwatch. Sample).
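
A stopwatch-style measurement along those lines might look like the sketch below. It assumes an in-memory SQLite table as a stand-in for the real database, and the helper average_seconds and the run count are arbitrary choices for illustration; the numbers only mean something when measured against your own DBMS and data.

```python
import sqlite3
import time

# Hypothetical setup: an in-memory SQLite table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?)",
    [(i % 10 + 1,) for i in range(2000)],
)

QUERIES = {
    "NOT IN": "SELECT * FROM tbl WHERE col NOT IN (9, 10)",
    "IN": "SELECT * FROM tbl WHERE col IN (1, 2, 3, 4, 5, 6, 7, 8)",
}

def average_seconds(sql, runs=200):
    """Run `sql` `runs` times and return the mean wall-clock time per run."""
    start = time.perf_counter()
    for _ in range(runs):
        conn.execute(sql).fetchall()
    return (time.perf_counter() - start) / runs

timings = {name: average_seconds(sql) for name, sql in QUERIES.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per run")
```

Raise runs to a few thousand for steadier averages, and repeat the whole measurement a few times, since a single pass can be skewed by caching.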

But here I highly recommend using the first query, because it is easier to maintain. The logic is that you need all records except 9 and 10. If you add the value 11 to your table and use the second query, your application's logic will break, which will of course lead to a bug.
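
The maintenance argument can be seen in a small sketch, again assuming SQLite via Python's sqlite3 module and a made-up single-column table. Both queries agree while the values run from 1 to 10 and diverge once 11 appears.

```python
import sqlite3

# Hypothetical table holding the values 1 through 10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(v,) for v in range(1, 11)])

NOT_IN = "SELECT col FROM tbl WHERE col NOT IN (9, 10) ORDER BY col"
IN_LIST = "SELECT col FROM tbl WHERE col IN (1, 2, 3, 4, 5, 6, 7, 8) ORDER BY col"

def values(sql):
    """Return the selected column values as a plain list."""
    return [row[0] for row in conn.execute(sql)]

# With values 1..10 the two queries return the same rows.
same_before = values(NOT_IN) == values(IN_LIST)

# A new value arrives later.
conn.execute("INSERT INTO tbl VALUES (11)")
after_not_in = values(NOT_IN)   # still "everything except 9 and 10"
after_in = values(IN_LIST)      # silently misses the new value

print(same_before, after_not_in, after_in)
```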

Edit: I remember this being tagged as php, which is why I provided the sample in php, but I might be mistaken. I guess it won't be hard to rewrite that sample in the language you're using.

I have seen Oracle have trouble optimizing some queries with NOT IN if columns are nullable. If you can write your query either way, IN is preferred as far as I'm concerned.
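
The sketch below does not reproduce the Oracle behaviour described above. It only illustrates, assuming SQLite via Python's sqlite3 module and a made-up table, the standard NULL semantics that make NOT IN awkward on nullable data: a comparison with NULL is neither true nor false, so the row is filtered out.

```python
import sqlite3

# Hypothetical table with a nullable column: one ordinary value,
# one excluded value and one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(1,), (9,), (None,)])

def values(sql):
    """Return the selected column values as a plain list."""
    return [row[0] for row in conn.execute(sql)]

# The NULL row is dropped: NULL NOT IN (9, 10) evaluates to NULL, not true.
not_in_rows = values("SELECT col FROM tbl WHERE col NOT IN (9, 10)")

# A NULL inside the list makes the predicate unknown for every non-matching row,
# so nothing is returned at all.
not_in_with_null = values("SELECT col FROM tbl WHERE col NOT IN (9, NULL)")

print(not_in_rows, not_in_with_null)
```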

For a list of constants, MySQL will internally expand your code to:

SELECT * FROM tbl WHERE ((col <> 9 and col <> 10))

The same goes for the other one, with eight = comparisons instead.

So yes, the first one will be faster, since there are fewer comparisons to do. The chance that the difference is measurable is negligible, though: the overhead of a handful of constant comparisons is nothing compared to the general overhead of parsing SQL and retrieving data.

The "IN" operator works internally like a series of "OR" conditions.

For example:

SELECT * FROM tbl WHERE col IN (1, 2, 3)

is equivalent to

SELECT * FROM tbl WHERE col = 1 OR col = 2 OR col = 3

"OR" conditions can cause some performance issues, as explained in this article: https://bertwagner.com/2018/02/20/or-vs-union-all-is-one-better-for-performance/

NOT IN works the same way, but with the result logically negated. However, you can write an equivalent query that performs better. In your example:

SELECT * FROM tbl WHERE col NOT IN (9, 10)

is equivalent to

SELECT * FROM tbl WHERE col <> 9 AND col <> 10

With "AND", the database stops evaluating as soon as one of the conditions is false, so it can perform better than the "OR" chain used by "IN".
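
The two rewrites in this answer can be checked for equal results with a short sketch. It assumes SQLite via Python's sqlite3 module and a made-up table with non-NULL values from 1 to 10, and the helper cols is hypothetical. It confirms only that the results match; it says nothing about which form is faster.

```python
import sqlite3

# Hypothetical table holding the non-NULL values 1 through 10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(v,) for v in range(1, 11)])

def cols(where):
    """Return the values of col matching the given WHERE clause, sorted."""
    sql = f"SELECT col FROM tbl WHERE {where} ORDER BY col"
    return [row[0] for row in conn.execute(sql)]

# IN behaves like a chain of OR-ed equality tests.
in_matches_or = cols("col IN (1, 2, 3)") == cols("col = 1 OR col = 2 OR col = 3")

# NOT IN behaves like a chain of AND-ed inequality tests.
not_in_matches_and = cols("col NOT IN (9, 10)") == cols("col <> 9 AND col <> 10")

print(in_matches_or, not_in_matches_and)
```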
