Performance: WHERE IN clause vs (INSERT + INNER JOIN)

Posted 2018-08-08 10:06:34. Tags: sql, database, performance, join, where

I have a use case where I need to perform a very high number of SQL SELECT queries.

I have two approaches at the moment:

1. Query by a list of identifiers. I first used a WHERE IN clause:
- SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE WHERE COL1 IN ( 1,2,3,8,11,78,59,65,74,25,36,54558,78854,558 )
2. Create a table, say CACHE_TABLE, first INSERT the identifiers ( 1,2,3,8,11,78,59,65,74,25,36,54558,78854,558 ) into it under a unique key CACHEID, and then JOIN this CACHE_TABLE with MAIN_TABLE to get the desired result:
- SELECT MT.COL1, MT.COL2, MT.COL3, MT.COL4 FROM MAIN_TABLE MT JOIN CACHE_TABLE CT ON CT.IDENTIFIER = MT.COL1 WHERE CT.CACHEID =
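To make the two approaches concrete, here is a minimal sketch that runs both against an in-memory SQLite database from Python. SQLite is only a stand-in for whatever engine is actually in use. The column types, the layout of CACHE_TABLE (one row per CACHEID and IDENTIFIER pair), the sample data, and the `cache_id` value are all assumptions made for illustration; none of them come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MAIN_TABLE (COL1 INTEGER, COL2 TEXT, COL3 TEXT, COL4 TEXT);
    -- Assumed layout: one row per (CACHEID, IDENTIFIER) pair.
    CREATE TABLE CACHE_TABLE (CACHEID INTEGER, IDENTIFIER INTEGER,
                              PRIMARY KEY (CACHEID, IDENTIFIER));
""")
# Hypothetical sample data: identifiers 1..100.
conn.executemany(
    "INSERT INTO MAIN_TABLE VALUES (?, ?, ?, ?)",
    [(i, f"a{i}", f"b{i}", f"c{i}") for i in range(1, 101)],
)

ids = [1, 2, 3, 8, 11, 78, 59, 65, 74, 25, 36, 54558, 78854, 558]

# Approach 1: WHERE IN, with one bound parameter per identifier.
placeholders = ", ".join("?" for _ in ids)
in_rows = conn.execute(
    "SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE "
    f"WHERE COL1 IN ({placeholders})",
    ids,
).fetchall()

# Approach 2: stage the identifiers under a cache id, then join.
cache_id = 1  # hypothetical key identifying this batch of identifiers
conn.executemany(
    "INSERT INTO CACHE_TABLE (CACHEID, IDENTIFIER) VALUES (?, ?)",
    [(cache_id, i) for i in ids],
)
join_rows = conn.execute(
    """SELECT MT.COL1, MT.COL2, MT.COL3, MT.COL4
       FROM MAIN_TABLE MT
       JOIN CACHE_TABLE CT ON CT.IDENTIFIER = MT.COL1
       WHERE CT.CACHEID = ?""",
    (cache_id,),
).fetchall()

# Both approaches should select the same rows.
print(sorted(in_rows) == sorted(join_rows))
```

Note that approach 2 needs at least two statements (the INSERT and the SELECT), plus cleanup of CACHE_TABLE if cache ids are reused.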

Performance is really critical in my use case, so I wanted to know whether approach #2 would yield better performance than approach #1. I would also like to know if there are better alternative approaches for this.

Thanks a ton in advance!

2 answers

Your best-performing option is approach #2. In my experience IN is a very slow operator, since SQL normally evaluates it as a series of conditions joined by OR (WHERE x = Y OR x = Z OR ...). As with all things SQL, though, your mileage may vary. The speed will depend a lot on indexes.

You need to test the two approaches.

For a single query, I would expect IN to win in most cases, simply because creating the table and then using it requires multiple round trips to the database.
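One way to run such a test is a small timing harness. The sketch below again uses in-memory SQLite from Python as a stand-in, so it has no network round trips and will understate the extra cost of approach 2 on a client/server database; the numbers only mean something when measured against the real engine and data. The table layouts, the row count, the index name `IDX_MAIN_COL1`, and the helper names are assumptions for illustration.

```python
import sqlite3
import time

IDS = [1, 2, 3, 8, 11, 78, 59, 65, 74, 25, 36, 54558, 78854, 558]

def setup(n_rows=100_000):
    """Build a throwaway database with hypothetical data and an index on COL1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE MAIN_TABLE (COL1 INTEGER, COL2 TEXT, COL3 TEXT, COL4 TEXT);
        CREATE INDEX IDX_MAIN_COL1 ON MAIN_TABLE (COL1);
        CREATE TABLE CACHE_TABLE (CACHEID INTEGER, IDENTIFIER INTEGER,
                                  PRIMARY KEY (CACHEID, IDENTIFIER));
    """)
    conn.executemany(
        "INSERT INTO MAIN_TABLE VALUES (?, ?, ?, ?)",
        ((i, f"a{i}", f"b{i}", f"c{i}") for i in range(1, n_rows + 1)),
    )
    conn.commit()
    return conn

def query_with_in(conn, ids):
    placeholders = ", ".join("?" for _ in ids)
    sql = ("SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE "
           f"WHERE COL1 IN ({placeholders})")
    return conn.execute(sql, ids).fetchall()

def query_with_join(conn, ids, cache_id):
    # The INSERT and the cleanup are timed too: they are part of approach 2.
    conn.executemany(
        "INSERT INTO CACHE_TABLE (CACHEID, IDENTIFIER) VALUES (?, ?)",
        [(cache_id, i) for i in ids],
    )
    rows = conn.execute(
        """SELECT MT.COL1, MT.COL2, MT.COL3, MT.COL4
           FROM MAIN_TABLE MT
           JOIN CACHE_TABLE CT ON CT.IDENTIFIER = MT.COL1
           WHERE CT.CACHEID = ?""",
        (cache_id,),
    ).fetchall()
    conn.execute("DELETE FROM CACHE_TABLE WHERE CACHEID = ?", (cache_id,))
    return rows

def timed(fn, *args, repeat=20):
    """Return (mean seconds per call, result of the last call)."""
    start = time.perf_counter()
    for _ in range(repeat):
        result = fn(*args)
    return (time.perf_counter() - start) / repeat, result

conn = setup()
in_secs, in_rows = timed(query_with_in, conn, IDS)
join_secs, join_rows = timed(query_with_join, conn, IDS, 1)
print(f"WHERE IN:      {in_secs * 1e6:.1f} us per call")
print(f"INSERT + JOIN: {join_secs * 1e6:.1f} us per call")
```

Running the same harness with longer identifier lists, and with the index removed, gives a rough picture of where the two approaches start to diverge.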

In addition, some databases optimize constant lists (for instance, MySQL does a binary search on values rather than a sequential search).

The one thing that will help either version is an index on (col1) or (col1, col2, col3, col4).
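Whether an index is actually used can be checked from the query plan. As a rough illustration, the sketch below asks SQLite (a stand-in engine; other databases have their own EXPLAIN syntax and output) for the plan of the WHERE IN query before and after creating an index on COL1. The index name `IDX_MAIN_COL1` and the helper `plan` are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MAIN_TABLE (COL1 INTEGER, COL2 TEXT, COL3 TEXT, COL4 TEXT)"
)

query = ("SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE "
         "WHERE COL1 IN (1, 2, 3, 8, 11)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(query)  # no index yet: expect a full table scan
conn.execute("CREATE INDEX IDX_MAIN_COL1 ON MAIN_TABLE (COL1)")
plan_after = plan(query)   # the plan should now mention the index

print("before:", plan_before)
print("after: ", plan_after)
```

The same check applies to the JOIN version; there the plan also shows which table the engine chose to drive the join from.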