How to identify and remove duplicates in SQL defined by a substring

Question

I have a complex issue regarding de-duplication in SQL that I could use some advice on:

I have a table with airport codes. However, there are duplicates in some cases where one row lists the local airport ID, while another lists the ICAO (international) ID, which includes a leading K.

I need to identify duplicates such as the following: KI80 and I80; KX49 and X49.

Note that there are many valid rows that start with a K.

Step 1: I need to identify the duplicates for the above cases.

Step 2: I need to use SQL to automatically delete all duplicates that have the leading K.

Step 3: I need to identify, in a different table (table b), which rows were using the identifiers that I just deleted, so I can update them to the surviving ID (example: if they used KI80, I need to change them to I80 in this other table).

Any help would be greatly appreciated!

Answer 1

You can use a self join in a delete statement. The idea is to join the table to itself, matching each row against another row's ID with a "K" prefix added. If a match exists, then the "K" record is a duplicate:

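-- MySQL multi-table DELETE. `airports` is a placeholder for the real
-- table name (the bare word `table` is reserved and would not parse).
delete t
    from airports t join
         airports tnotk
         on t.airportID = concat('K', tnotk.airportID) and tnotk.airportID not like 'K%'
    where t.airportID like 'K%';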

Note: this assumes that no non-ICAO airport IDs start with a "K".
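The statement above only covers Step 2. As a rough sketch of how all three steps might fit together, the following uses Python's built-in sqlite3 module on a scratch in-memory database. The names `airports`, `table_b`, and `dup_map`, as well as the sample rows, are assumptions made for illustration and are not from the question. The SQL is written in SQLite's dialect (`||` for concatenation and `IN` subqueries), so it is not a drop-in for the MySQL statement above. The mapping of duplicate pairs is saved first, so the references in the second table can be repointed before the K-prefixed rows are deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema and sample data (not from the original question).
cur.executescript("""
    CREATE TABLE airports (airportID TEXT PRIMARY KEY);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, airportID TEXT);
    INSERT INTO airports VALUES ('KI80'), ('I80'), ('KX49'), ('X49'), ('KJFK'), ('O22');
    INSERT INTO table_b (airportID) VALUES ('KI80'), ('X49'), ('KJFK'), ('KX49');
""")

# Step 1: record every K-prefixed ID whose remainder exists as its own row.
# Same assumption as the answer: local IDs never start with 'K'.
cur.execute("""
    CREATE TEMP TABLE dup_map AS
    SELECT k.airportID AS icao_id, loc.airportID AS local_id
    FROM airports AS k
    JOIN airports AS loc
      ON k.airportID = 'K' || loc.airportID
     AND loc.airportID NOT LIKE 'K%'
""")
duplicates = cur.execute(
    "SELECT icao_id, local_id FROM dup_map ORDER BY icao_id"
).fetchall()

# Step 3 (run before the delete): repoint table_b rows to the surviving ID.
cur.execute("""
    UPDATE table_b
    SET airportID = (SELECT local_id FROM dup_map
                     WHERE icao_id = table_b.airportID)
    WHERE airportID IN (SELECT icao_id FROM dup_map)
""")

# Step 2: delete the K-prefixed duplicates from the airport table.
cur.execute("DELETE FROM airports WHERE airportID IN (SELECT icao_id FROM dup_map)")
conn.commit()

remaining = [r[0] for r in cur.execute(
    "SELECT airportID FROM airports ORDER BY airportID")]
refs = [r[0] for r in cur.execute(
    "SELECT airportID FROM table_b ORDER BY id")]

print(duplicates)  # the (ICAO ID, local ID) pairs found in Step 1
print(remaining)   # airport IDs left after the delete
print(refs)        # table_b references after the update
```

In this sample, KJFK is left alone because no JFK row exists, which is how valid K-prefixed rows survive. Keeping `dup_map` around also gives a record of what was removed, which may be useful for auditing.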

Solution 1
1 2014-11-16 15:02:05
