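SQL Server database: delete duplicate trend values, but keep the first and last
I have a large database full of temperature values. The problem is that the system saved a large number of duplicate values, and now the SQL Server database (Express edition) is full.
Here is a sample of the data, using the query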
SELECT [PointID], [Time], [Data]
FROM TrendSamples
WHERE pointid = 13
ORDER BY time DESC
Result:
PointID Time Data
---------------------------------------------
13 2020-01-02 07:29:01.077 22,2999992370605
13 2020-01-02 07:28:50.937 22,5
13 2020-01-02 07:28:05.230 22,2999992370605
13 2020-01-02 07:27:55.090 22,3999996185303
13 2020-01-02 07:27:04.510 22,3999996185303
13 2020-01-02 07:26:13.443 22,3999996185303
13 2020-01-02 07:25:22.580 22,3999996185303
13 2020-01-02 07:24:31.340 22,3999996185303
13 2020-01-02 07:23:40.370 22,3999996185303
13 2020-01-02 07:22:49.460 22,3999996185303
13 2020-01-02 07:21:59.160 22,3999996185303
13 2020-01-02 07:21:08.483 22,3999996185303
13 2020-01-02 07:20:17.713 22,3999996185303
13 2020-01-02 07:19:26.710 22,3999996185303
13 2020-01-02 07:18:35.283 22,3999996185303
13 2020-01-02 07:17:44.250 22,3999996185303
13 2020-01-02 07:16:53.463 22,3999996185303
13 2020-01-02 07:16:02.367 22,3999996185303
13 2020-01-02 07:15:11.083 22,3999996185303
13 2020-01-02 07:14:19.987 22,3999996185303
13 2020-01-02 07:13:29.230 22,3999996185303
13 2020-01-02 07:12:38.197 22,3999996185303
13 2020-01-02 07:11:47.957 22,3999996185303
13 2020-01-02 07:10:57.033 22,3999996185303
13 2020-01-02 07:10:06.293 22,3999996185303
13 2020-01-02 07:09:15.183 22,3999996185303
13 2020-01-02 07:08:24.083 22,3999996185303
13 2020-01-02 07:07:33.237 22,3999996185303
13 2020-01-02 07:06:42.140 22,3999996185303
13 2020-01-02 07:05:51.557 22,3999996185303
13 2020-01-02 07:05:00.787 22,3999996185303
13 2020-01-02 07:04:09.707 22,3999996185303
13 2020-01-02 07:03:18.970 22,3999996185303
13 2020-01-02 07:02:28.043 22,3999996185303
13 2020-01-02 07:01:36.930 22,3999996185303
13 2020-01-02 07:00:46.317 22,3999996185303
13 2020-01-02 06:59:55.390 22,3999996185303
13 2020-01-02 06:59:04.403 22,3999996185303
13 2020-01-02 06:58:13.103 22,3999996185303
13 2020-01-02 06:58:01.247 22,5
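As you can see, there is a lot of duplicate data where the value does not change at all from one timestamp to the next. Is there a way to delete all the duplicates but keep the first and last row of each run, that is, the rows just after and just before the value changes?
The result I want is this: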
PointID Time Data
---------------------------------------------
13 2020-01-02 07:29:01.077 22,2999992370605
13 2020-01-02 07:28:50.937 22,5
13 2020-01-02 07:28:05.230 22,2999992370605
13 2020-01-02 07:27:55.090 22,3999996185303
13 2020-01-02 06:58:13.103 22,3999996185303
13 2020-01-02 06:58:01.247 22,5
You can use an updatable CTE for this:
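-- mytable stands for your own table (TrendSamples in the question)
with cte as (
    select
        t.*,
        -- data value of the previous and next row for the same pointID, in time order
        lag(data) over(partition by pointID order by time) lag_data,
        lead(data) over(partition by pointID order by time) lead_data
    from mytable t
)
-- delete only the rows that sit strictly inside a run of identical values
delete from cte where (data = lag_data and data = lead_data)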
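The CTE uses lag() and lead() to bring in the data values of the previous and next rows with the same pointID, ordered by time. The outer query then deletes the rows whose data is equal to that of both the previous and the next row. For the first and last rows of each pointID, lag() or lead() returns NULL, so the comparison is not true and those rows are kept.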
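To make the lag()/lead() rule concrete, here is a minimal Python sketch that applies the same condition to a shortened version of the question's sample for PointID 13. The function name `rows_to_keep` and the in-memory list are illustrative assumptions; this is not how SQL Server evaluates the query, only the same keep/delete decision written out row by row.

```python
def rows_to_keep(samples):
    """Apply the lag/lead rule to one PointID.

    samples: list of (time, data) tuples sorted by time ascending.
    A row is dropped only if its data equals both the previous and
    the next row's data, mirroring
    `delete ... where data = lag_data and data = lead_data`.
    """
    kept = []
    for i, (time, data) in enumerate(samples):
        # None plays the role of NULL at the edges: the comparison fails there
        lag_data = samples[i - 1][1] if i > 0 else None
        lead_data = samples[i + 1][1] if i + 1 < len(samples) else None
        if data == lag_data and data == lead_data:
            continue  # strictly inside a run of identical values
        kept.append((time, data))
    return kept


# Shortened subset of the question's sample, oldest first
samples = [
    ("2020-01-02 06:58:01.247", 22.5),
    ("2020-01-02 06:58:13.103", 22.3999996185303),
    ("2020-01-02 06:59:04.403", 22.3999996185303),
    ("2020-01-02 06:59:55.390", 22.3999996185303),
    ("2020-01-02 07:27:04.510", 22.3999996185303),
    ("2020-01-02 07:27:55.090", 22.3999996185303),
    ("2020-01-02 07:28:05.230", 22.2999992370605),
    ("2020-01-02 07:28:50.937", 22.5),
    ("2020-01-02 07:29:01.077", 22.2999992370605),
]

kept = rows_to_keep(samples)
for time, data in kept:
    print(time, data)
```

On this subset the rule keeps the same six timestamps that the question lists as the desired result and drops the three rows in the middle of the 22.3999996185303 run.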
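This assumes your table has an auto-generated ID column with unique numbers.
By using Row_Number() twice in a CTE, we can partition the data by value, number the rows by the auto-generated ID column in ascending and in descending order, and delete every row that is neither first nor last in its partition. Note that the partition is by Data alone, so this keeps the first and last row for each distinct value across the whole table (all PointIDs together), not for each consecutive run of unchanged values.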
with result as (
    select *,
        Row_Number() over(partition by Data order by ID) rownumber1,
        Row_Number() over(partition by Data order by ID desc) rownumber2
    from TrendSamples
)
delete from result where rownumber1 > 1 and rownumber2 > 1
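To see which rows the two Row_Number() filters leave behind, here is a small Python sketch of the same rule. The function name and the (ID, Data) values are hypothetical and chosen for illustration; they are not taken from the question's table. Because the partition is by Data only, "first" and "last" refer to each distinct value overall.

```python
def rows_kept_by_row_number_rule(rows):
    """Mirror `delete ... where rownumber1 > 1 and rownumber2 > 1`.

    rows: list of (id, data) tuples with unique ids.
    A row survives if it has the smallest or the largest id
    among all rows sharing its data value.
    """
    ordered = sorted(rows)  # ascending by id
    first_id = {}
    last_id = {}
    for row_id, data in ordered:
        first_id.setdefault(data, row_id)  # rownumber1 == 1
        last_id[data] = row_id             # rownumber2 == 1
    return [
        (row_id, data)
        for row_id, data in ordered
        if row_id == first_id[data] or row_id == last_id[data]
    ]


# Hypothetical values: 22.5 | 22.4 x3 | 22.5 x3 | 22.4 x2
rows = [
    (1, 22.5),
    (2, 22.4), (3, 22.4), (4, 22.4),
    (5, 22.5), (6, 22.5), (7, 22.5),
    (8, 22.4), (9, 22.4),
]

kept_rows = rows_kept_by_row_number_rule(rows)
for row_id, data in kept_rows:
    print(row_id, data)
```

With these values only IDs 1, 2, 7 and 9 survive. Rows 4, 5 and 8, which start or end a run, are deleted as well, so when a value recurs later the result differs from keeping the first and last row of every run.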