SQL Server - select distinct rows by two columns

I have the following format

Name   someString     theValue 
abc    bla bla         3
abc    bla bla         3
abd    bla bla         5
abd    bla bla         5
xfz    ffs ds ds       50
xfz    ffs ds ds       50

expected result:

Name   someString     theValue 
abc    bla bla         3
abd    bla bla         5
xfz    ffs ds ds       50

Tried to solve my problem using this query:

SELECT Name, someString, theValue, COUNT(*) FROM myTable
        GROUP BY Name, someString, theValue
          HAVING COUNT(*) = 1

The logic here being: append to each row a count number, and select only the first one which was counted per set of rows.

This shrinks my result from 2160 rows to 40, but there are still duplicates here.

My question is: is the logic I used in the query correct (for filtering out duplicate rows by 2 columns), and would you have written this another way?

Using SQL Server 2012

Thank you

Surely what you want to do is

SELECT DISTINCT Name, someString, theValue FROM myTable

Or, if you need to know how many occurrences there are, then

SELECT Name, someString, theValue, Count(*) as ct FROM myTable GROUP BY Name, someString, theValue
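
As for why the original query still misbehaves: HAVING COUNT(*) = 1 does not pick the first row of each group, it keeps only the groups that contain exactly one row. A minimal sketch, using a table variable in place of myTable (the 'xyz' row is made-up sample data added for illustration):

```sql
-- Hypothetical stand-in for myTable; the column types are assumptions.
DECLARE @myTable TABLE (Name varchar(10), someString varchar(20), theValue int);

INSERT INTO @myTable (Name, someString, theValue)
VALUES ('abc', 'bla bla', 3),
       ('abc', 'bla bla', 3),   -- duplicate of the row above
       ('xyz', 'solo',    7);   -- occurs only once

-- Keeps only combinations that occur exactly once,
-- so the duplicated 'abc' combination is dropped entirely.
SELECT Name, someString, theValue, COUNT(*) AS ct
FROM @myTable
GROUP BY Name, someString, theValue
HAVING COUNT(*) = 1;
```

With this sample data only the 'xyz' row comes back, which is the opposite of deduplicating: every combination that appears more than once disappears from the result.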

Given your expected output, this can be done simply with

SELECT DISTINCT Name, someString, theValue
FROM myTable

The result set will contain each distinct combination of Name, someString, theValue.

If you also want the count of each combination, you can write it as

SELECT Name, someString, theValue, count(*) combination_count
FROM myTable
GROUP BY Name, someString, theValue

It may be a long shot, but everything you have written about the issue suggests that these duplicates are not actually duplicates. Perhaps there are some trailing spaces which cause trouble? In that case, the following should work:

SELECT DISTINCT RTRIM(name), RTRIM(someString), theValue FROM myTable;
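
To check whether padding is actually present before trimming anything, something like the following might help. It compares LEN, which ignores trailing spaces, with DATALENGTH, which counts every byte. This is only a sketch and assumes name and someString are single-byte varchar columns (for nvarchar, DATALENGTH is twice the character count, so the comparison would need adjusting):

```sql
-- Lists rows whose stored byte length differs from the trimmed character length,
-- i.e. rows that carry trailing spaces (assuming single-byte varchar columns).
SELECT name,
       LEN(name)              AS name_len,
       DATALENGTH(name)       AS name_bytes,
       someString,
       LEN(someString)        AS str_len,
       DATALENGTH(someString) AS str_bytes
FROM myTable
WHERE DATALENGTH(name) <> LEN(name)
   OR DATALENGTH(someString) <> LEN(someString);
```

If this returns no rows, trailing spaces are probably not the cause, and other invisible characters (leading spaces, tabs) or differing theValue values are worth checking next.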

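If the duplicates come from differing values in theValue, you can try the following (SQL Server requires an ORDER BY inside the OVER clause for first_value):

SELECT DISTINCT name, someString, first_value(theValue) over (partition by name, someString order by theValue) AS theValue FROM myTable;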

You'll need to remove "theValue" from both the SELECT list and the GROUP BY clause.

You probably have multiple "theValue" values for individual Name / someString pairs.

I would suggest a simple aggregation as the simplest approach:

select name, someString, min(theValue) as theValue
from t
group by name, someString;
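
To confirm that this is what is happening, a diagnostic along these lines could be run first. It is a sketch against the table name from the question (myTable) and lists every name / someString pair that carries more than one distinct theValue:

```sql
-- Pairs that would still produce more than one row under
-- SELECT DISTINCT name, someString, theValue.
SELECT name,
       someString,
       COUNT(DISTINCT theValue) AS distinct_values,
       MIN(theValue)            AS min_value,
       MAX(theValue)            AS max_value
FROM myTable
GROUP BY name, someString
HAVING COUNT(DISTINCT theValue) > 1;
```

If this returns rows, you have to decide which value to keep per pair; min() above is one choice, max() or the most recent value are others.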
