Very slow DELETE query

I have a SQL performance problem. For some reason, the following queries are very slow:

I have two lists that contain Ids from a certain table. I need to delete every record from the first list whose Id already exists in the second list:

DECLARE @IdList1 TABLE(Id INT)
DECLARE @IdList2 TABLE(Id INT)

-- Approach 1
DELETE list1
FROM @IdList1 list1
INNER JOIN @IdList2 list2 ON list1.Id = list2.Id

-- Approach 2
DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2)

The two lists may each contain more than 10,000 records. In that case, each query takes more than 20 seconds to execute.

The execution plan also showed something I don't understand, which may explain why it is so slow:

(Image: execution plans of the two queries)

I filled both lists with 10,000 sequential integers, so both lists contained the values 1 to 10,000 as a starting point.

As you can see, both queries show an Actual Number of Rows of 50,005,000 for @IdList2! The figure for @IdList1 is correct (Actual Number of Rows is 10,000).

I know there are other ways to solve this, such as filling a third list instead of removing from the first list. But my question is:

Why are these delete queries so slow, and why do I see these strange query plans?

Add a primary key to your table variables and watch them scream:

DECLARE @IdList1 TABLE(Id INT primary Key not null)
DECLARE @IdList2 TABLE(Id INT primary Key not null)

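Because there is no index on these table variables, any join or subquery must examine on the order of 10,000 times 10,000 = 100,000,000 pairs of values. The 50,005,000 rows reported in the plan are consistent with this: that figure equals 1 + 2 + ... + 10,000, which is what you would get if each lookup scanned @IdList2 only until it reached its first match.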
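To try the suggestion end to end, here is a minimal sketch that fills two keyed table variables with 10,000 sequential integers and times the delete. The row generator (a cross join of `sys.all_objects` with itself) and the use of `SET STATISTICS TIME` are assumptions added for illustration. They are not part of the original question.

```sql
DECLARE @IdList1 TABLE (Id INT NOT NULL PRIMARY KEY);
DECLARE @IdList2 TABLE (Id INT NOT NULL PRIMARY KEY);

-- Hypothetical row generator: number the first 10,000 rows of a catalog cross join.
WITH Numbers AS (
    SELECT TOP (10000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Id
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO @IdList1 (Id)
SELECT Id FROM Numbers;

-- Copy the same values into the second list.
INSERT INTO @IdList2 (Id)
SELECT Id FROM @IdList1;

-- Report CPU and elapsed time for the delete in the Messages output.
SET STATISTICS TIME ON;

DELETE list1
FROM @IdList1 AS list1
INNER JOIN @IdList2 AS list2 ON list1.Id = list2.Id;

SET STATISTICS TIME OFF;

-- Every Id in the first list also exists in the second, so this should return 0.
SELECT COUNT(*) AS RemainingRows FROM @IdList1;
```

Removing `PRIMARY KEY` from both declarations and running the script again gives a direct comparison on your own server.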

SQL Server compiles the plan when the table variable is empty and does not recompile it when rows are added. Try:

DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2)
OPTION (RECOMPILE)

This takes account of the actual number of rows in the table variable and gets rid of the nested loops plan.

Of course, creating an index on Id via a constraint may well benefit other queries that use the table variable too.

Table variables can have primary keys, so if your data supports uniqueness for these Ids, you may be able to improve performance by going for:

DECLARE @IdList1 TABLE(Id INT PRIMARY KEY)
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY)

Possible solutions:

1) Try to create indexes:

1.1) If the List{1|2}.Id column has unique values, you could define a unique clustered index using a PK constraint like this:

DECLARE @IdList1 TABLE(Id INT PRIMARY KEY);
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY);

1.2) If the List{1|2}.Id column may have duplicate values, you could still define a unique clustered index through a PK constraint by adding a dummy IDENTITY column like this:

DECLARE @IdList1 TABLE(Id INT, DummyID INT IDENTITY, PRIMARY KEY (ID, DummyID) );
DECLARE @IdList2 TABLE(Id INT, DummyID INT IDENTITY, PRIMARY KEY (ID, DummyID) );

2) Try adding a HASH JOIN query hint like this:

DELETE list1
FROM @IdList1 list1
INNER JOIN @IdList2 list2 ON list1.Id = list2.Id
OPTION (HASH JOIN);

You are using table variables. Either add a primary key to them, or change them to temporary tables and add an index. Either change should perform much better. As a rule of thumb, use table variables if the table is small. If the table grows and contains a lot of data, use a temp table.

I'd be tempted to try:
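As a sketch of the temp table route applied to the two Id lists from the question: the table names `#IdList1` and `#IdList2`, the index names, and the three sample rows per table are hypothetical and stand in for the real data.

```sql
-- Hypothetical temp-table versions of the two lists.
CREATE TABLE #IdList1 (Id INT NOT NULL);
CREATE TABLE #IdList2 (Id INT NOT NULL);

-- Sample data only; in practice, load the real Ids here.
INSERT INTO #IdList1 (Id) VALUES (1), (2), (3);
INSERT INTO #IdList2 (Id) VALUES (2), (3), (4);

-- Index the join column so the delete can seek instead of scanning.
CREATE CLUSTERED INDEX IX_IdList1_Id ON #IdList1 (Id);
CREATE CLUSTERED INDEX IX_IdList2_Id ON #IdList2 (Id);

DELETE list1
FROM #IdList1 AS list1
WHERE EXISTS (SELECT 1 FROM #IdList2 AS list2 WHERE list2.Id = list1.Id);

-- With the sample data above, only Id 1 remains.
SELECT Id FROM #IdList1;

DROP TABLE #IdList1;
DROP TABLE #IdList2;
```

Unlike table variables, temp tables have column statistics, which may help the optimizer pick a better join strategy as the row counts grow.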

DECLARE @IdList3 TABLE(Id INT);

-- ORDER BY may only follow the final query of an EXCEPT,
-- so the per-query ORDER BY clauses are removed here.
INSERT @IdList3
SELECT Id FROM @IdList1
EXCEPT
SELECT Id FROM @IdList2

No deleting required.

Try this alternative syntax:
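One thing to keep in mind with this approach is that EXCEPT returns distinct rows, so duplicate Ids in the first list are collapsed. If the duplicates matter, a NOT EXISTS filter is one possible alternative. The sketch below uses made-up sample values purely for illustration.

```sql
DECLARE @IdList1 TABLE (Id INT);
DECLARE @IdList2 TABLE (Id INT);
DECLARE @IdList3 TABLE (Id INT);

-- Sample data only: Id 1 appears twice in the first list.
INSERT INTO @IdList1 (Id) VALUES (1), (1), (2), (3);
INSERT INTO @IdList2 (Id) VALUES (2);

-- Keep every row of the first list that has no match in the second list.
INSERT INTO @IdList3 (Id)
SELECT list1.Id
FROM @IdList1 AS list1
WHERE NOT EXISTS (SELECT 1 FROM @IdList2 AS list2 WHERE list2.Id = list1.Id);

-- Contains 1, 1 and 3 (row order is not guaranteed); EXCEPT would yield only 1 and 3.
SELECT Id FROM @IdList3;
```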

DELETE deleteAlias
FROM @IdList1 deleteAlias
WHERE EXISTS (
        SELECT NULL
        FROM @IdList2 innerList2Alias
        WHERE innerList2Alias.id=deleteAlias.id
    )

EDIT:

Try using #temp tables with indexes instead.

Here is a generic example where "DepartmentKey" is the PK and the FK.

-- Drop any leftover copy of the temp table from an earlier run.
IF OBJECT_ID('tempdb..#Department') IS NOT NULL
BEGIN
    DROP TABLE #Department
END

CREATE TABLE #Department
(
    DepartmentKey int,
    DepartmentName varchar(12)
)

CREATE INDEX IX_TEMPTABLE_Department_DepartmentKey ON #Department (DepartmentKey)

IF OBJECT_ID('tempdb..#Employee') IS NOT NULL
BEGIN
    DROP TABLE #Employee
END

CREATE TABLE #Employee
(
    EmployeeKey int,
    DepartmentKey int,
    SSN varchar(11)
)

CREATE INDEX IX_TEMPTABLE_Employee_DepartmentKey ON #Employee (DepartmentKey)

-- Delete every department that has at least one matching employee row.
DELETE deleteAlias
FROM #Department deleteAlias
WHERE EXISTS (
        SELECT NULL
        FROM #Employee innerE
        WHERE innerE.DepartmentKey = deleteAlias.DepartmentKey
    )

-- Clean up.
IF OBJECT_ID('tempdb..#Employee') IS NOT NULL
BEGIN
    DROP TABLE #Employee
END

IF OBJECT_ID('tempdb..#Department') IS NOT NULL
BEGIN
    DROP TABLE #Department
END
