简体   繁体   中英

SQL - Delete all duplicates but the highest of each group for each user ID

It looks like we might have some duplicate values in our preference table so I want to delete all the duplicate entries for each user but the more recent one . I've spent quite a while trying to come up with a solution and I just keep running around in circles. I've tried max, grouped by, sub-queries, etc. It's pretty easy to delete the highest, but not everything but the highest. And especially for each user

So in the database I might have:

id  :   name:       value   :   userId
1   :   sortOrder:  Asc     :   1
2   :   sortOrder:  Desc    :   1
3   :   sortOrder:  Asc     :   2
4   :   something:  value2  :   1

So in this case, I want to remove the first row since it's duplicate for User 1 and has the highest id and keep all the rest of the columns.

So I know it's at least a sub-query with group by, but I just can't figure it out. So far I have:

SELECT 
    MAX(id),
    name
FROM 
    preference
GROUP BY
    name

which gives me a list of those I want to keep, BUT it still misses for each user . So I somewhat want the opposite if I had it for each user. However I don't believe I can use <> in a sub-query anyways.

What I want to ultimately do is say delete all those that have a lower ID for the same User that are duplicate entries.

One way, assuming Id column is a unique Id:

DELETE FROM preference
where Id NOT IN (SELECT MAX(id) FROM preference GROUP BY name, userId)

[Note: The usual advice applies. don't run in production without testing first]

Test First with:

SELECT * FROM preference
where Id NOT IN (SELECT MAX(id) FROM preference GROUP BY name, userId)

SQLFiddle is here . Works as intended. I suggest testing with a bigger data set though...

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM