'where' in with duplicates or 'select distinct' - performance

Question

I would like to return some results basing on subquery results and I was wondering what would be faster - having subquery results be distinct or where in clause having duplicates?
Example

select * 
from some_table 
where my_column in (select /* distinct? */ my_column from some_table_2)

If

-- subquery
select /*distinct?*/ my_column from some_table_2;

This would return something like:

  | MY_COLUMN
1 | 25
2 | 25
3 | 54
...

This example may not make sense but it presents my point.

Answer 1

There is no benefit to putting distinct in the subquery for in . An in does what is called a "semi-join". This stops at the first matching row.

Oracle probably ignores the select distinct . Other databases might actually implement it.

If you care about performance, try both these versions:

select t.* 
from some_table t
where my_column in (select my_column from some_table_2)

and

select t.*
from some_table t
where exists (select 1 from some_table_2 t2 where t2.my_column = t.my_column);

This version can take advantage of an index on some_table_2(my_column) .

'where' in with duplicates or 'select distinct' - performance

Question

1 answers

solution1
3 2017-07-30 12:14:07

'where' in with duplicates or 'select distinct' - performance

Question

1 answers

solution1 3 2017-07-30 12:14:07

solution1
3 2017-07-30 12:14:07