Avoid duplicates in INSERT INTO SELECT query in SQL Server

I have the following two tables:

Table1
----------
ID   Name
1    A
2    B
3    C

Table2
----------
ID   Name
1    Z

I need to insert data from Table1 into Table2. I can use the following syntax:

INSERT INTO Table2(Id, Name) SELECT Id, Name FROM Table1

However, Table2 might already contain some of these IDs (in my case, it's just "1"), and I don't want to copy those rows again because that would throw an error.

I can write something like this:

IF NOT EXISTS(SELECT 1 FROM Table2 WHERE Id=1)
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1 
ELSE
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1 WHERE Table1.Id<>1

Is there a better way to do this without using IF - ELSE? I want to avoid two INSERT INTO-SELECT statements based on some condition.

Using NOT EXISTS:

INSERT INTO TABLE_2
  (id, name)
SELECT t1.id,
       t1.name
  FROM TABLE_1 t1
 WHERE NOT EXISTS(SELECT id
                    FROM TABLE_2 t2
                   WHERE t2.id = t1.id)

Using NOT IN:

INSERT INTO TABLE_2
  (id, name)
SELECT t1.id,
       t1.name
  FROM TABLE_1 t1
 WHERE t1.id NOT IN (SELECT id
                       FROM TABLE_2)

Using LEFT JOIN/IS NULL:

INSERT INTO TABLE_2
  (id, name)
   SELECT t1.id,
          t1.name
     FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.id = t1.id
    WHERE t2.id IS NULL

Of the three options, LEFT JOIN/IS NULL is less efficient than the other two. See this link for more details.
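To try the NOT EXISTS variant against the sample data from the question, a minimal sketch follows. The temp tables (#Table1, #Table2), the column types, and the primary keys are assumptions made for illustration; they are not given in the question.

```sql
-- Sample data from the question, held in temp tables (an assumption, for easy cleanup).
CREATE TABLE #Table1 (Id INT NOT NULL PRIMARY KEY, Name VARCHAR(50) NOT NULL);
CREATE TABLE #Table2 (Id INT NOT NULL PRIMARY KEY, Name VARCHAR(50) NOT NULL);

INSERT INTO #Table1 (Id, Name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO #Table2 (Id, Name) VALUES (1, 'Z');

-- Copy only the rows whose Id is not already present in the target.
INSERT INTO #Table2 (Id, Name)
SELECT t1.Id, t1.Name
  FROM #Table1 AS t1
 WHERE NOT EXISTS (SELECT 1
                     FROM #Table2 AS t2
                    WHERE t2.Id = t1.Id);

-- Id 1 keeps its existing name 'Z'; Ids 2 and 3 are added.
SELECT Id, Name FROM #Table2 ORDER BY Id;

DROP TABLE #Table1;
DROP TABLE #Table2;
```

One practical difference between the variants: if the id column of the target table can contain NULL, the NOT IN form returns no rows at all, while NOT EXISTS still behaves as expected.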

In MySQL you can do this:

INSERT IGNORE INTO Table2(Id, Name) SELECT Id, Name FROM Table1

Does SQL Server have anything similar?

I just ran into a similar problem, and the DISTINCT keyword works like magic:

INSERT INTO Table2(Id, Name) SELECT DISTINCT Id, Name FROM Table1

I was facing the same problem recently.
Here's what worked for me in MS SQL Server 2017.
The primary key should be set on ID in Table 2.
The columns and column properties should of course be the same in both tables. This will work the first time you run the script below. The duplicate ID in Table 1 will not be inserted.

If you run it a second time, you will get a "Violation of PRIMARY KEY constraint" error.

This is the code:

Insert into Table_2
Select distinct *
from Table_1
where table_1.ID >1

Using the ignore-duplicates option on the unique index, as suggested by IanC here, was my solution for a similar issue: create the index with the option WITH IGNORE_DUP_KEY.

In backward compatible syntax, WITH IGNORE_DUP_KEY is equivalent to WITH IGNORE_DUP_KEY = ON.

Ref.: index_option
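As a rough illustration of the IGNORE_DUP_KEY approach, the sketch below builds a unique index with that option and then runs a plain INSERT ... SELECT. The temp tables, column types, and the index name IX_Table2_Id are assumptions chosen for the example.

```sql
-- Hypothetical temp tables mirroring the question's sample data.
CREATE TABLE #Table1 (Id INT NOT NULL, Name VARCHAR(50) NOT NULL);
CREATE TABLE #Table2 (Id INT NOT NULL, Name VARCHAR(50) NOT NULL);

-- Unique index that discards rows whose key already exists
-- instead of failing the whole statement.
CREATE UNIQUE INDEX IX_Table2_Id
    ON #Table2 (Id)
  WITH (IGNORE_DUP_KEY = ON);

INSERT INTO #Table1 (Id, Name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO #Table2 (Id, Name) VALUES (1, 'Z');

-- Plain INSERT ... SELECT: the row with Id = 1 is skipped and SQL Server
-- reports the warning "Duplicate key was ignored." rather than an error.
INSERT INTO #Table2 (Id, Name)
SELECT Id, Name
  FROM #Table1;

-- Id 1 is still 'Z'; Ids 2 and 3 were inserted.
SELECT Id, Name FROM #Table2 ORDER BY Id;

DROP TABLE #Table1;
DROP TABLE #Table2;
```

This is the closest SQL Server counterpart to MySQL's INSERT IGNORE for key violations, with the difference that the behavior is a property of the index, so it applies to every insert into the table.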

In SQL Server you can set a unique key index on the table for the columns that need to be unique.

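In SQL Server, right-click the table, choose Design, and select Indexes/Keys.

Select the column(s) that must not contain duplicates, then set the type to Unique Key.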
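The same unique key can also be created with a script instead of the table designer. The sketch below assumes a hypothetical demo table dbo.Table2_Demo and a constraint name UQ_Table2_Demo_Name; neither appears in the original answer.

```sql
-- Hypothetical demo table; names and types are assumptions.
CREATE TABLE dbo.Table2_Demo (
    Id   INT         NOT NULL PRIMARY KEY,
    Name VARCHAR(50) NOT NULL
);

-- Script equivalent of choosing "Unique Key" in the Indexes/Keys dialog.
ALTER TABLE dbo.Table2_Demo
  ADD CONSTRAINT UQ_Table2_Demo_Name UNIQUE (Name);

INSERT INTO dbo.Table2_Demo (Id, Name) VALUES (1, 'Z');

-- A second row with the same Name is rejected by the constraint.
BEGIN TRY
    INSERT INTO dbo.Table2_Demo (Id, Name) VALUES (2, 'Z');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

DROP TABLE dbo.Table2_Demo;
```

Note that a plain unique constraint rejects the duplicate with an error. It does not skip it silently, so the INSERT ... SELECT still needs a NOT EXISTS filter or similar.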

A little off topic, but if you want to migrate the data to a new table, the possible duplicates are in the original table, and the possibly duplicated column is not an id, a GROUP BY will do:

INSERT INTO TABLE_2
(name)
  SELECT t1.name
  FROM TABLE_1 t1
  GROUP BY t1.name

In my case, I had duplicate IDs in the source table, so none of the proposals worked. I don't care about performance, since it only runs once. To solve this, I fetched the records one by one with a cursor and skipped the duplicates.

So here's the code example:

DECLARE @c1 AS VARCHAR(12);
DECLARE @c2 AS VARCHAR(250);
DECLARE @c3 AS VARCHAR(250);


DECLARE MY_cursor CURSOR STATIC FOR
Select
c1,
c2,
c3
from T2
where ....;

OPEN MY_cursor
FETCH NEXT FROM MY_cursor INTO @c1, @c2, @c3

WHILE @@FETCH_STATUS = 0
BEGIN
    if (select count(1) 
        from T1
        where a1 = @c1
        and a2 = @c2
        ) = 0 
            INSERT INTO T1
            values (@c1, @c2, @c3)

    FETCH NEXT FROM MY_cursor INTO @c1, @c2, @c3
END
CLOSE MY_cursor
DEALLOCATE MY_cursor
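For comparison, here is a set-based sketch that is not the author's cursor method but handles the same situation: ROW_NUMBER() keeps one row per (c1, c2) pair from the source, and NOT EXISTS skips pairs already in the target. The temp tables, the target column name a3, the sample rows, and the ORDER BY c3 tie-break are all assumptions made for illustration.

```sql
-- Hypothetical stand-ins for T1 (target) and T2 (source).
CREATE TABLE #T1 (a1 VARCHAR(12) NOT NULL, a2 VARCHAR(250) NOT NULL, a3 VARCHAR(250) NULL);
CREATE TABLE #T2 (c1 VARCHAR(12) NOT NULL, c2 VARCHAR(250) NOT NULL, c3 VARCHAR(250) NULL);

-- The source contains a duplicate (k1, x) pair; the target already has (k2, y).
INSERT INTO #T2 (c1, c2, c3) VALUES ('k1', 'x', 'first'), ('k1', 'x', 'second'), ('k2', 'y', 'only');
INSERT INTO #T1 (a1, a2, a3) VALUES ('k2', 'y', 'existing');

WITH ranked AS (
    SELECT c1, c2, c3,
           -- Number the rows inside each (c1, c2) group; rn = 1 is the row kept.
           ROW_NUMBER() OVER (PARTITION BY c1, c2 ORDER BY c3) AS rn
      FROM #T2
)
INSERT INTO #T1 (a1, a2, a3)
SELECT r.c1, r.c2, r.c3
  FROM ranked AS r
 WHERE r.rn = 1
   AND NOT EXISTS (SELECT 1
                     FROM #T1 AS t
                    WHERE t.a1 = r.c1
                      AND t.a2 = r.c2);

-- Only ('k1', 'x', 'first') is added; ('k2', 'y') was already present.
SELECT a1, a2, a3 FROM #T1 ORDER BY a1;

DROP TABLE #T1;
DROP TABLE #T2;
```

Which duplicate survives depends entirely on the ORDER BY inside ROW_NUMBER(), so that clause should reflect whatever rule the migration actually needs.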

I used a MERGE query to fill a table without duplicates. The problem I had was a two-column key in the tables (Code, Value), and the EXISTS query was very slow. The MERGE executed much faster (more than 100x).

Examples for MERGE query
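A minimal sketch of such a MERGE on a two-column key might look like the following. The table names #Source and #Target, the column types, and the sample rows are assumptions; only the (Code, Value) key comes from the answer.

```sql
-- Hypothetical source and target tables keyed on (Code, Value).
CREATE TABLE #Source ([Code] VARCHAR(10) NOT NULL, [Value] VARCHAR(50) NOT NULL);
CREATE TABLE #Target ([Code] VARCHAR(10) NOT NULL, [Value] VARCHAR(50) NOT NULL,
                      PRIMARY KEY ([Code], [Value]));

INSERT INTO #Source ([Code], [Value]) VALUES ('A', 'x'), ('A', 'x'), ('B', 'y');
INSERT INTO #Target ([Code], [Value]) VALUES ('A', 'x');

-- Insert only the key pairs that the target does not have yet.
-- DISTINCT removes duplicates inside the source before matching.
MERGE INTO #Target AS t
USING (SELECT DISTINCT [Code], [Value] FROM #Source) AS s
   ON t.[Code] = s.[Code]
  AND t.[Value] = s.[Value]
 WHEN NOT MATCHED BY TARGET THEN
      INSERT ([Code], [Value]) VALUES (s.[Code], s.[Value]);

-- The target now holds ('A', 'x') and ('B', 'y').
SELECT [Code], [Value] FROM #Target ORDER BY [Code];

DROP TABLE #Source;
DROP TABLE #Target;
```

The MERGE statement must be terminated with a semicolon. Whether it outperforms NOT EXISTS depends on the indexes and data involved, so the speedup reported above should be measured rather than assumed.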

For one table it works perfectly when creating one unique index from multiple fields. Then a simple "INSERT IGNORE" will ignore duplicates if ALL 7 fields (in this case) have the SAME values.

Select the fields in the PMA Structure view and click Unique; a new combined index will be created.

A simple DELETE before the INSERT would suffice:

DELETE FROM Table2 WHERE Id IN (SELECT Id FROM Table1)
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1

Switch Table1 and Table2 depending on which table's Id and name pairing you want to preserve.
