
SQL: Removing duplicate rows while retaining the row with the highest value in another column

Suppose I have a table so_test with this data:

SOID SO_Name   SO_Desc     PRIORITY  ADE_PRIORITIZED  DEPLOY_DATE  ENV
123  SO1      SO1 Desc1      111      Y               01-JAN-01     0
123  SO1      SO1 Desc1      111      Y               01-JAN-01     1
123  SO1      SO1 Desc1      111      Y               01-JAN-01     2
123  SO1      SO1 Desc1      111      Y               01-JAN-01     3
987  SO1      SO1 Desc1      111      Y               01-JAN-01     0
987  SO1      SO1 Desc1      111      Y               01-JAN-16     1
987  SO1      SO1 Desc1      111      Y               21-JAN-17     2
987  SO1      SO1 Desc1      111      Y               01-JAN-17     3
121  SO121    SO121 Desc121  111      Y               01-JAN-17     0

I want to remove the duplicate rows for each soid (a duplicate is determined by these 4 columns: so_name, so_desc, priority, ade_prioritized), retaining the row with the highest deploy_date.

I used this query, but it doesn't delete any rows.

delete from so_test a 
where a.deploy_date < (
  select max(b.deploy_date) from so_test b where a.soid = b.soid
);

0 rows deleted

The end result I expect should be:

SOID SO_Name   SO_Desc     PRIORITY  ADE_PRIORITIZED  DEPLOY_DATE  ENV
123  SO1      SO1 Desc1      111      Y               01-JAN-01     0
987  SO1      SO1 Desc1      111      Y               21-JAN-17     2
987  SO1      SO1 Desc1      111      Y               21-JAN-17     2

What could be the issue? Can it be done without a CTE?
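One thing the sample data does show: rows that tie on deploy_date (all four soid 123 rows) can never satisfy a strict < comparison against their own maximum, so this form of delete cannot reduce them to a single row. The sketch below reproduces that under some assumptions that are not part of the question: it uses Python's built-in sqlite3 as a stand-in database, stores the dates as ISO text so that string comparison orders them correctly, and refers to the outer table by name instead of the alias a.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO text (an assumption)
ROWS = [
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 0),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 1),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 2),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 3),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 0),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2016-01-01", 1),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2017-01-21", 2),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2017-01-01", 3),
    (121, "SO121", "SO121 Desc121", 111, "Y", "2017-01-01", 0),
]


def load_sample():
    """Create an in-memory so_test table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table so_test (soid integer, so_name text, so_desc text, "
        "priority integer, ade_prioritized text, deploy_date text, env integer)"
    )
    conn.executemany("insert into so_test values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def rows_left_after_strict_delete():
    """Run the strict '<' delete and count the surviving rows per soid."""
    conn = load_sample()
    conn.execute(
        """
        delete from so_test
        where deploy_date < (
          select max(b.deploy_date) from so_test b where b.soid = so_test.soid
        )
        """
    )
    counts = conn.execute(
        "select soid, count(*) from so_test group by soid"
    ).fetchall()
    return dict(counts)


remaining = rows_left_after_strict_delete()
print(remaining)
```

In this reproduction the older soid 987 rows are removed, but all four soid 123 rows survive because they share one deploy_date. Keeping exactly one row per group therefore needs a tie-breaker, such as a row number or a physical row identifier.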

Using with (a common table expression) and row_number(), you can both identify and then easily handle duplicates:

When using a CTE, you can only perform one statement after the expression (unless you are chaining CTEs or using multiple CTEs).

In the following code example, you would first check the output by using the select; then, if further action is necessary, comment out the select query and uncomment the delete query.

rextester link: http://rextester.com/UFQQ51693

with cte as (
  select   
      *
    , rn = row_number() over (
            partition by soid 
            order by deploy_date desc
            )
    from [so_test]
)
/* --------------------------------------------------------------
-- This returns all rows for every soid that has duplicates,
-- along with the row number (rn), so you can see which rows 
-- would be affected by the following actions
-------------------------------------------------------------- */
/*
select o.*
  from cte as o
  where exists (
      select 1
        from cte as i
        where o.soid  = i.soid 
          and i.rn>1
      );
--*/
/* --------------------------------------------------------------
-- Remove duplicates by deleting all of the duplicates
-- where the row number (rn) is greater than 1
-- without deleting the first row of the duplicates.
-------------------------------------------------------------- */
--/*
delete 
  from cte 
  where cte.rn > 1 
--*/

rextester results after delete:

+------+---------+---------------+----------+-----------------+---------------------+-----+
| soid | so_name |    so_desc    | priority | ade_prioritized |     deploy_date     | env |
+------+---------+---------------+----------+-----------------+---------------------+-----+
|  123 | SO1     | SO1_Desc1     |      111 | Y               | 01.01.2001 00:00:00 |   0 |
|  987 | SO1     | SO1_Desc1     |      111 | Y               | 21.01.2017 00:00:00 |   2 |
|  121 | SO121   | SO121_Desc121 |      111 | Y               | 01.01.2017 00:00:00 |   0 |
+------+---------+---------------+----------+-----------------+---------------------+-----+
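On the question of doing this without a CTE: deleting through the cte as above relies on the database allowing deletes against a CTE, which SQL Server does. A possible alternative is to number the rows in a derived table and delete by row identifier. The sketch below is only an illustration under assumptions that are not in the original posts: it runs on Python's built-in sqlite3 (window functions need SQLite 3.25 or later), stores dates as ISO text, uses SQLite's rowid, and adds env as a tie-breaker so the kept row is deterministic. Other databases may need a different row identifier or syntax.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO text (an assumption)
ROWS = [
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 0),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 1),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 2),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 3),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 0),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2016-01-01", 1),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2017-01-21", 2),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2017-01-01", 3),
    (121, "SO121", "SO121 Desc121", 111, "Y", "2017-01-01", 0),
]


def dedupe_without_cte():
    """Delete every row except the newest per soid, using a derived table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table so_test (soid integer, so_name text, so_desc text, "
        "priority integer, ade_prioritized text, deploy_date text, env integer)"
    )
    conn.executemany("insert into so_test values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    conn.execute(
        """
        delete from so_test
        where rowid in (
          select rid
          from (
            select rowid as rid,
                   row_number() over (
                     partition by soid
                     order by deploy_date desc, env  -- env breaks date ties
                   ) as rn
            from so_test
          )
          where rn > 1
        )
        """
    )
    return conn.execute(
        "select soid, deploy_date, env from so_test order by soid"
    ).fetchall()


kept = dedupe_without_cte()
print(kept)
```

With the sample data this keeps one row each for soid 121, 123, and 987, the same rows as the rextester output above. To follow the question's definition of a duplicate more literally, the four columns so_name, so_desc, priority, and ade_prioritized could be added to the partition by list next to soid.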

Example based on preserving the non-duplicates in a new table.

create table so_test_nodups 
as
with dups as 
( select soid, so_name, so_desc, priority, ade_prioritized, deploy_date, env,  
        row_number() over ( partition by so_name, so_desc, priority, ade_prioritized order by deploy_date desc ) rn 
  from so_test 
) 
select  soid, so_name, so_desc, priority, ade_prioritized, deploy_date, env 
from dups 
where rn=1 

Querying the so_test_nodups table.

select * from so_test_nodups

      SOID SO_NAME    SO_DESC                PRIORITY A DEPLOY_DA        ENV
---------- ---------- -------------------- ---------- - --------- ----------
       123 SO1        SO1 Desc1                   111 Y 01-JAN-01          0
       121 SO121      SO121 Desc121               111 Y 01-JAN-17          0

Adding the results after the edits provided:

      SOID SO_NAME    SO_DESC                PRIORITY A DEPLOY_DA        ENV
---------- ---------- -------------------- ---------- - --------- ----------
       987 SO1        SO1 Desc1                   111 Y 21-JAN-17          2
       121 SO121      SO121 Desc121               111 Y 01-JAN-17          0
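Both result sets above contain only one SO1 row, because the partition by list leaves out soid, so the rows for soid 123 and 987 fall into the same group. The sketch below compares the two partitionings on the sample data. It is a rough illustration rather than the answer's exact setup: it assumes Python's built-in sqlite3 (SQLite 3.25 or later for window functions), ISO text dates, and env as an added tie-breaker; the helper name build_nodups is hypothetical.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO text (an assumption)
ROWS = [
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 0),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 1),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 2),
    (123, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 3),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2001-01-01", 0),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2016-01-01", 1),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2017-01-21", 2),
    (987, "SO1", "SO1 Desc1", 111, "Y", "2017-01-01", 3),
    (121, "SO121", "SO121 Desc121", 111, "Y", "2017-01-01", 0),
]

DUP_COLS = "so_name, so_desc, priority, ade_prioritized"


def build_nodups(partition_by):
    """Build so_test_nodups with the given partition columns (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table so_test (soid integer, so_name text, so_desc text, "
        "priority integer, ade_prioritized text, deploy_date text, env integer)"
    )
    conn.executemany("insert into so_test values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    # partition_by is a fixed constant in this sketch, not user input
    conn.execute(
        f"""
        create table so_test_nodups as
        with dups as (
          select soid, so_name, so_desc, priority, ade_prioritized, deploy_date, env,
                 row_number() over (
                   partition by {partition_by}
                   order by deploy_date desc, env
                 ) as rn
          from so_test
        )
        select soid, so_name, so_desc, priority, ade_prioritized, deploy_date, env
        from dups
        where rn = 1
        """
    )
    return conn.execute(
        "select soid, deploy_date, env from so_test_nodups order by soid"
    ).fetchall()


without_soid = build_nodups(DUP_COLS)
with_soid = build_nodups("soid, " + DUP_COLS)
print(without_soid)
print(with_soid)
```

Partitioning by the four columns alone leaves two rows (soid 121 and 987), matching the edited results above, while adding soid to the partition keeps one row per soid, which is closer to what the question describes.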
