简体   繁体   中英

T-SQL Delete half of duplicates with no primary key

In a T-SQL stored procedure I have a complex procedure that is comparing data using temp tables but at the end of everything when I return a single table I end up with duplicate rows. In these rows all columns in the row are EXACTLY the same and there is no primary key within this table. I need to delete only half of these based on the number of times that row occurs. For example if there are eight rows that are all the same value. I want to delete four of them.

There is no way to get rid of them through my SP filtering because the data that is entered is literally duplicate information entered in by the user but I do required half of that information.

I've done some research on the subject and did some testing but it seems as if it's not possible to delete half of the duplicated rows. Is this not possible? Or is there a way?

Here is one way, using a great feature of SQL Server, updatable CTEs:

with todelete as (
      select t.*,
             row_number() over (partition by col1, col2, col3, . . . order by newid()) as seqnum
      from table t
     )
delete from todelete
    where seqnum % 2 = 0;

This will delete every other value.

Assuming SQL Server 2005+:

;WITH CTE AS
(
    SELECT  *,
            RN=ROW_NUMBER() OVER(PARTITION BY Col1, Col2,...Coln ORDER BY Col1)
    FROM YourTempTableHere
)
DELETE FROM CTE
WHERE RN = 1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM