Performance issue with LEFT OUTER JOIN in SQL Server

Question

In my project I need to find the tasks that differ between the old and the new revision in the same table.

id | task | latest_Rev
1  | A    | N
1  | B    | N
2  | C    | Y
2  | A    | Y
2  | B    | Y

Expected Result:

id | task | latest_Rev
2  | C    | Y
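
To reproduce the example, the sample rows above can be loaded into a temporary table. This is only a minimal sketch: the temp table name #Rev_tmp and the column types are assumptions, because the post does not show the real table definition.

```sql
-- Assumed schema: the post does not show the real definition of Rev_tmp.
CREATE TABLE #Rev_tmp
(
    id         INT         NOT NULL,
    task       VARCHAR(10) NOT NULL,
    latest_Rev CHAR(1)     NOT NULL   -- 'Y' = latest revision, 'N' = older revision
);

INSERT INTO #Rev_tmp (id, task, latest_Rev)
VALUES (1, 'A', 'N'),
       (1, 'B', 'N'),
       (2, 'C', 'Y'),
       (2, 'A', 'Y'),
       (2, 'B', 'Y');

-- Goal: return only (2, 'C', 'Y'), the task that is new in the latest revision.
```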

So I tried the following query:

  Select nw.* 
  from Rev_tmp nw with (nolock)
  left outer 
  join rev_tmp  old with (nolock)
  on   nw.id -1  = old.id
  and  nw.task = old.task
  and  nw.latest_rev = 'y'
  where old.task is null

When my table has more than 20k records, this query takes a long time. How can I reduce the run time?

My company does not allow the use of subqueries.

Answer 1

Use the LAG function to avoid the self join:

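SELECT *
FROM   (SELECT *,
               -- revision flag of the previous row for the same task (NULL when the task has no earlier row)
               Lag(latest_Rev) OVER(partition BY task ORDER BY id) AS prev_rev
        FROM   Rev_tmp) a
WHERE  latest_Rev = 'Y'      -- keep only rows of the latest revision
       AND prev_rev IS NULL  -- whose task did not appear in an earlier revision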
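
Note that LAG looks at the previous row for the same task no matter how far apart the ids are, whereas the join in the question compares against exactly id - 1. If a task can skip a revision, a variant along these lines may match the join more closely. It is a sketch only: prev_id is an alias introduced here for illustration, it assumes each (id, task) pair appears at most once, and LAG requires SQL Server 2012 or later.

```sql
SELECT id, task, latest_Rev
FROM   (SELECT id, task, latest_Rev,
               -- id of the previous row for the same task; prev_id is a hypothetical alias
               LAG(id) OVER (PARTITION BY task ORDER BY id) AS prev_id
        FROM   Rev_tmp) a
WHERE  latest_Rev = 'Y'
  -- keep the row when the task has no earlier row, or the earlier row is not revision id - 1
  AND  (prev_id IS NULL OR prev_id <> id - 1);
```

For the sample data in the question this keeps only the row (2, C, Y): tasks A and B have a previous row with id 1, while task C has none.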

Answer 2

latest_Rev should be a bit column (SQL Server's boolean equivalent), which is better for performance.
You could also add an index on the id, task, and latest_Rev columns.
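
As a rough sketch of the index suggestion, assuming you are allowed to create indexes on the table; the index name ix_Rev_tmp_id_task_rev is hypothetical:

```sql
-- Hypothetical index name; the column order is one reasonable choice, not a tested recommendation.
CREATE NONCLUSTERED INDEX ix_Rev_tmp_id_task_rev
    ON Rev_tmp (id, task, latest_Rev);
```

With id and task as the leading key columns, the lookup on old.id and old.task can in principle be answered with an index seek instead of a scan. Whether the optimizer actually chooses it depends on the data, so check the execution plan.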

You can also try this query, which replaces the left outer join with NOT EXISTS:

Select * 
from Rev_tmp nw
where nw.latest_rev = 'y' and not exists
(
select * from rev_tmp  old
where nw.id -1  = old.id and  nw.task = old.task
)

Answer 3

My answer assumes:

You can't change the indexes
You can't use subqueries
All fields are indexed separately

If you look at the query, the only condition that really reduces the result set is latest_rev = 'Y'. If you removed that condition, you would definitely get a table scan, so we want it to be evaluated using an index. Unfortunately, an index on a column that holds only 'Y' and 'N' is likely to be ignored by the optimizer because its selectivity is terrible. You might get better performance if you coax SQL Server into using it anyway. If the index on latest_rev is called idx_latest_rev, try this:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

Select nw.* 
from Rev_tmp nw with (index(idx_latest_rev))
left outer 
join rev_tmp  old 
on   nw.id -1  = old.id
and  nw.task = old.task
where old.task is null
and  nw.latest_rev = 'y'
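
Whether the hint helps depends on the data, so it is worth measuring both versions. One way to do that, as a sketch, is to turn on SQL Server's I/O and timing statistics for the session and compare the figures reported for each run (in SQL Server Management Studio they appear on the Messages tab):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run this once as written, then again with the index hint added to nw,
-- and compare the logical reads and elapsed time reported for each run.
SELECT nw.*
FROM   Rev_tmp nw
LEFT OUTER JOIN Rev_tmp old
       ON  nw.id - 1 = old.id
       AND nw.task   = old.task
WHERE  old.task IS NULL
  AND  nw.latest_rev = 'Y';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```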

3 answers

solution1
5 ACCEPTED 2017-06-16 08:08:50

solution2
0 2017-06-16 08:04:07

solution3
0 2017-06-16 08:22:13
