简体   繁体   中英

Best approach to SQL Server query using distinct

I've two tables: TableA and TableB, joined by TableA.TableA_Id->1..n<-TableB.TableA_Id. A simple PK-FK.

I need to extract distinct TableA records given a certain condition on TableB. Here's my 1st approach:

SELECT * FROM TableA A INNER JOIN TableB B ON A.idA = B.IdA AND B.Date = '2009-01-10' ORDER BY A.Id;

This is nice, but it doesn't give me "distinct" records. Some records on table B may be duplicate and hence I could get the same records more than once.

So I decided to perform a subselect (performance is not an issue given that the subselect will probably end up with 20/30 records max):

SELECT * FROM TableA WHERE TableA.Id IN ( SELECT DISTINCT IdA FROM TableB WHERE Date = '20090110' ) ORDER BY TableA.IdA;

This works fine.

Now the question is: how can I use the Inner Join and still get the distinct values? Is this possible in one pass or the nested query is a must? What am I missing?

use a derived table

SELECT * FROM TableA 
JOIN
(SELECT DISTINCT IdA FROM TableB WHERE Date = '20090110') a
ON a.IDA = TAbleA.IDA
ORDER BY TableA.IdA

I think a normal exists statement is what you need:

SELECT * 
FROM TableA A 
WHERE Exists( select B.IdA from TableB B where A.IdA = B.IdA and B.Date = '2009-01-10' )
ORDER BY A.Id;

Performance-wise it should be the best approach.

If you need values from the other table, and to avoid using distinct, you could join to a sub query that is grouped like so:

Select TableA.*, groupedTableB.OtherData 
From TableA
Inner join 
(
    select TableB.IdA, Sum(TableB.Data) SummaryData
    from TableB where TableB.Date = '2009-01-10'
    group by TableB.IdA

) groupedTableB on groupedTableB.IdA = TableA.IdA

And the problem using

SELECT DISTINCT * FROM TableA A INNER JOIN TableB B ON A.idA = B.IdA 
AND B.Date = '2009-01-10' ORDER BY A.Id;

is?

If it's that it does return duplicate idA values, that's because you are selecting too many columns, if you can't reduce what you select, you need the subquery.

How about this :

select * from TableA a 
  where a.TableA_Id in
(select TableA_Id from TableB where [Date] = '2009-01-10')
order by a.TableA_Id

You could add distinct to the subquery if you want, I'm not sure whether this will improve or reduce preformance offhand.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM