简体   繁体   中英

'where' in with duplicates or 'select distinct' - performance


I would like to return some results basing on subquery results and I was wondering what would be faster - having subquery results be distinct or where in clause having duplicates?
Example

select * 
from some_table 
where my_column in (select /* distinct? */ my_column from some_table_2)

If

-- subquery
select /*distinct?*/ my_column from some_table_2;

This would return something like:

  | MY_COLUMN
1 | 25
2 | 25
3 | 54
...

This example may not make sense but it presents my point.

There is no benefit to putting distinct in the subquery for in . An in does what is called a "semi-join". This stops at the first matching row.

Oracle probably ignores the select distinct . Other databases might actually implement it.

If you care about performance, try both these versions:

select t.* 
from some_table t
where my_column in (select my_column from some_table_2)

and

select t.*
from some_table t
where exists (select 1 from some_table_2 t2 where t2.my_column = t.my_column);

This version can take advantage of an index on some_table_2(my_column) .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM