T-SQL Delete half of duplicates with no primary key

Question

In a T-SQL stored procedure I have a complex procedure that is comparing data using temp tables but at the end of everything when I return a single table I end up with duplicate rows. In these rows all columns in the row are EXACTLY the same and there is no primary key within this table. I need to delete only half of these based on the number of times that row occurs. For example if there are eight rows that are all the same value. I want to delete four of them.

There is no way to get rid of them through my SP filtering because the data that is entered is literally duplicate information entered in by the user but I do required half of that information.

I've done some research on the subject and did some testing but it seems as if it's not possible to delete half of the duplicated rows. Is this not possible? Or is there a way?

Answer 1

Here is one way, using a great feature of SQL Server, updatable CTEs:

with todelete as (
      select t.*,
             row_number() over (partition by col1, col2, col3, . . . order by newid()) as seqnum
      from table t
     )
delete from todelete
    where seqnum % 2 = 0;

This will delete every other value.

Answer 2

Assuming SQL Server 2005+:

;WITH CTE AS
(
    SELECT  *,
            RN=ROW_NUMBER() OVER(PARTITION BY Col1, Col2,...Coln ORDER BY Col1)
    FROM YourTempTableHere
)
DELETE FROM CTE
WHERE RN = 1

T-SQL Delete half of duplicates with no primary key

Question

2 answers

solution1
3 ACCPTED 2014-04-28 20:48:36

solution2
0 2014-04-28 20:48:09

T-SQL Delete half of duplicates with no primary key

Question

2 answers

solution1 3 ACCPTED 2014-04-28 20:48:36

solution2 0 2014-04-28 20:48:09

solution1
3 ACCPTED 2014-04-28 20:48:36

solution2
0 2014-04-28 20:48:09