I have a history table that takes an extract of another table each day and stamps it with a timestamp. Unfortunately, in the past the data was loaded multiple times per day, which should not have happened.
It's like:
And should be like:
I am looking for a way to delete the duplicates, keeping only the row with the first timestamp for each day.
Do you have any ideas to delete the duplicates in this way?
Thank you in advance!
I would recommend deleting using a CTE:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY id, CONVERT(date, ts_col) ORDER BY ts_col) rn
FROM yourTable
)
DELETE
FROM cte
WHERE rn > 1; -- targets all records per day except for the first one
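To try the CTE delete safely before running it on real data, something like the following sketch may help. It assumes SQL Server; the temp table #history, its payload column, and the sample rows are made up for illustration and are not from the question. The delete runs inside a transaction so the result can be inspected and rolled back.

```sql
-- Hypothetical sample table; names and values are invented for illustration.
CREATE TABLE #history (id INT, ts_col DATETIME2(0), payload VARCHAR(10));

INSERT INTO #history (id, ts_col, payload) VALUES
    (1, '2020-01-01 06:00:00', 'a'),
    (1, '2020-01-01 12:00:00', 'a'),  -- second load on the same day
    (1, '2020-01-02 06:00:00', 'b'),
    (2, '2020-01-01 06:00:00', 'c'),
    (2, '2020-01-01 18:30:00', 'c');  -- second load on the same day

BEGIN TRANSACTION;

-- Number the rows per id and calendar day, earliest timestamp first,
-- then delete everything except the first row of each group.
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY id, CONVERT(date, ts_col)
                              ORDER BY ts_col) AS rn
    FROM #history
)
DELETE FROM cte
WHERE rn > 1;

-- Inspect what is left before making the change permanent.
SELECT id, ts_col, payload
FROM #history
ORDER BY id, ts_col;

-- COMMIT TRANSACTION;   -- use this instead to keep the change
ROLLBACK TRANSACTION;    -- undo the delete while testing

DROP TABLE #history;
```

With this sample data, the SELECT should show only the earliest row per id and day (three rows). Once the output looks right on the real table, swap the ROLLBACK for the COMMIT.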
If you have only two columns use aggregation:
select id, min(timestamp) as timestamp
from t
group by id, convert(date, timestamp);
If you have many columns and want the complete row, then row_number()
is probably the best option:
select t.*
from (select t.*,
row_number() over (partition by id, convert(date, timestamp) order by timestamp) as seqnum
from t
) t
where seqnum = 1;
You can use this SELECT to check which rows would be deleted:
select a.* from yourtable a
inner join
(
select id,convert(date,[datetime]) [date], MIN([datetime]) [datetime]
from yourtable
group by id,convert(date,[datetime])
) b on a.id = b.id and convert(date,a.[datetime]) = b.[date] and a.[datetime] <> b.[datetime]
And the delete:
delete a from yourtable a
inner join
(
select id,convert(date,[datetime]) [date], MIN([datetime]) [datetime]
from yourtable
group by id,convert(date,[datetime])
) b on a.id = b.id and convert(date,a.[datetime]) = b.[date] and a.[datetime] <> b.[datetime]