Performance: WHERE IN clause vs (INSERT + INNER JOIN)

I have a use case where I need to perform a very high number of SELECT SQL

I have two approaches at the moment:

  1. Query by a list of identifiers. So, I first used a WHERE IN clause:

    • SELECT COL1, COL2, COL3, COL4 FROM MAIN_TABLE WHERE COL1 IN ( 1,2,3,8,11,78,59,65,74,25,36,54558,78854,558 )
  2. I can create a table, let's say CACHE_TABLE, first INSERT the identifiers ( 1,2,3,8,11,78,59,65,74,25,36,54558,78854,558 ) into it under a unique key CACHEID, and then JOIN this CACHE_TABLE with MAIN_TABLE to get the desired result:

    • SELECT MT.COL1, MT.COL2, MT.COL3, MT.COL4 FROM MAIN_TABLE MT JOIN CACHE_TABLE CT ON CT.IDENTIFIER = MT.COL1 WHERE CT.CACHEID =

Performance is really critical in my use case, so I wanted to know whether approach #2 would yield better performance than approach #1. Also, if there is a better alternative approach(s) for this

Thanks a ton in advance!

Your answer is approach #2, which should give the best performance. In my experience, IN is a very slow operator, since SQL normally evaluates it as a series of conditions separated by OR (WHERE x=Y OR x=Z OR ...). As with all things SQL, though, your mileage may vary. The speed will depend a lot on indexes

You need to test the two approaches.

For a single query, I would expect IN to win in most cases, simply because creating the table and then using it requires multiple round-trips to the database.

In addition, some databases optimize constant lists (for instance, MySQL does a binary search on the values rather than a sequential search).

The one thing that will help either version is an index on (col1) or (col1, col2, col3, col4).
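Whether such an index is actually picked up can be checked from the query plan. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, purely as an illustration: the table definition, the index name idx_main and the helper plan_for_in_query are assumptions, and other databases (MySQL, PostgreSQL, Oracle, ...) have their own EXPLAIN syntax and output.

```python
import sqlite3


def plan_for_in_query(index_columns=None):
    """Return SQLite's query plan text for the IN query.

    index_columns is an optional list of column names; when given, a
    hypothetical index named idx_main is created on them first.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE main_table (col1 INTEGER, col2 TEXT, col3 TEXT, col4 TEXT)"
    )
    if index_columns:
        columns = ", ".join(index_columns)
        conn.execute(f"CREATE INDEX idx_main ON main_table ({columns})")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT col1, col2, col3, col4 FROM main_table "
        "WHERE col1 IN (1, 2, 3, 8, 11)"
    ).fetchall()
    conn.close()
    # The human-readable description is the last column of each plan row.
    return " | ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    print("no index        :", plan_for_in_query())
    print("(col1)          :", plan_for_in_query(["col1"]))
    print("(col1..col4)    :", plan_for_in_query(["col1", "col2", "col3", "col4"]))
```

The four-column variant is what is usually called a covering index: every column the query selects is in the index itself, so the database may not need to read the table rows at all.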
