SQL Server delete performance
I have a routine in our .NET web application that allows a user on our platform to clear their account (i.e. delete all their data). This routine runs in a stored procedure and essentially loops through the relevant data tables and clears down all the various items they have created.
The stored procedure looks something like this.
ALTER procedure [dbo].[spDeleteAccountData](
    @accountNumber varchar(30) )
AS
BEGIN
    SET ANSI_NULLS ON;
    SET NOCOUNT ON;
    BEGIN TRAN
    BEGIN TRY
        DELETE FROM myDataTable1 WHERE accountNumber = @accountNumber
        DELETE FROM myDataTable2 WHERE accountNumber = @accountNumber
        DELETE FROM myDataTable3 WHERE accountNumber = @accountNumber
        -- Etc.........
    END TRY
    BEGIN CATCH
        -- CATCH ERROR
    END CATCH
    IF @@TRANCOUNT > 0
        COMMIT TRANSACTION;
    SET ANSI_NULLS OFF;
    SET NOCOUNT OFF;
END
The problem is that in some cases we can have over 10,000 rows on a table and the procedure can take up to 3-5 minutes. During this period all the other connections on the database get throttled, causing time-out errors like the one below:
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Are there any general changes I can make to improve performance? I appreciate there are many unknowns related to the design of our database schema, but general best practice advice would be welcomed! I thought about scheduling this task to run during the early hours to minimise impact, but this is far from ideal as the user wouldn't be able to regain access to their account until this task had been completed.
Additional Information:
Edit: 16:52 GMT
The delete proc affects around 20 tables. The largest one has approx 5 million records. The others have no more than 200,000, with some containing only 1000-2000 records.
Do you have an index on accountNumber in all tables?
Seeing that you delete using a WHERE clause by that column, this might help.
Another option (and probably an even better solution) would be to schedule deletion operations at night, e.g. when a user chooses to delete his account, you only set a flag, and a delete job that runs at night actually deletes the accounts flagged for deletion.
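A rough sketch of the flag-and-nightly-job idea, under some loud assumptions: the table dbo.Accounts, the column isPendingDeletion and the procedure spFlagAccountForDeletion are hypothetical names that do not appear in the question; only spDeleteAccountData comes from the original post. The second batch is what a scheduled job (for example a SQL Server Agent job step) might run.

```sql
-- Hypothetical schema: dbo.Accounts and isPendingDeletion are assumed names.
ALTER TABLE dbo.Accounts
    ADD isPendingDeletion bit NOT NULL
        CONSTRAINT DF_Accounts_isPendingDeletion DEFAULT (0);
GO

-- Called by the web application: only sets the flag, so it returns quickly.
CREATE PROCEDURE dbo.spFlagAccountForDeletion
    @accountNumber varchar(30)
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE dbo.Accounts
    SET isPendingDeletion = 1
    WHERE accountNumber = @accountNumber;
END
GO

-- Body of the nightly job: run the existing delete procedure per flagged account.
DECLARE @accountNumber varchar(30);

-- STATIC takes a snapshot of the flagged accounts, so clearing the flag
-- inside the loop does not disturb the iteration.
DECLARE pending CURSOR LOCAL STATIC FOR
    SELECT accountNumber FROM dbo.Accounts WHERE isPendingDeletion = 1;

OPEN pending;
FETCH NEXT FROM pending INTO @accountNumber;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC dbo.spDeleteAccountData @accountNumber = @accountNumber;

    -- Clear the flag once the account data is gone.
    UPDATE dbo.Accounts
    SET isPendingDeletion = 0
    WHERE accountNumber = @accountNumber;

    FETCH NEXT FROM pending INTO @accountNumber;
END
CLOSE pending;
DEALLOCATE pending;
```

The application would also have to treat a flagged account as empty (or locked) until the job has run, which is the trade-off the question already mentions.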
If you have an index on the accountNumber field, then I guess the long deletion time is due to locks (generated by other processes) or to foreign keys that reference the respective tables.
Of course purists will blame me for the latter, but I have used this many times when the need arose.
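To check whether locks held by other sessions are what slows the delete down, one option is to query the dynamic management views while the procedure is running. This is a minimal diagnostic sketch, not something taken from the question; it needs VIEW SERVER STATE permission.

```sql
-- Lists requests that are currently waiting on another session.
SELECT
    r.session_id,
    r.blocking_session_id,          -- the session holding the conflicting lock
    r.wait_type,
    r.wait_time AS waitTimeMs,      -- time waited so far, in milliseconds
    t.text      AS runningSql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```

If the delete session shows up here as blocked, or other sessions show it as their blocker, locking rather than raw delete cost is the likely culprit.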
SqlCommand.CommandTimeout is the short answer. Increase its value.
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.commandtimeout.aspx
Note, the Connection Timeout is not the same thing as the CommandTimeout.
...
Do you have an index on "accountNumber" on each table?
You could have a clustered key on the surrogate-key of the table, but not the "accountNumber".
...
Basically, you're gonna have to look at the execution plan (or post the execution plan) here.
But here is some "starter code" for trying an index on that column(s).
-- Drop the index if it already exists, then (re)create it.
IF EXISTS (SELECT * FROM sys.indexes WHERE name = N'IX_myDataTable1_accountNumber' AND object_id = OBJECT_ID(N'[dbo].[myDataTable1]'))
    DROP INDEX [IX_myDataTable1_accountNumber] ON [dbo].[myDataTable1]
GO
CREATE INDEX [IX_myDataTable1_accountNumber] ON [dbo].[myDataTable1]([accountNumber])
GO
It could be worth switching the database into Read Committed Snapshot mode. This will have a performance impact; how much depends on your application.
In Read Committed Snapshot mode, writers and readers no longer block each other, although writers still block writers. You don't say what sort of activity on the table is getting prevented by the delete, so it's a little hard to say if this will help.
http://msdn.microsoft.com/en-us/library/ms188277(v=sql.105).aspx
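For reference, switching the mode on looks roughly like this. MyDatabase is a placeholder name, not something from the question. The change needs the database to be free of other active connections, so the ROLLBACK IMMEDIATE option shown here rolls back other sessions' open transactions and disconnects them; it is best tried in a maintenance window.

```sql
-- 'MyDatabase' is a placeholder; substitute the real database name.
ALTER DATABASE [MyDatabase]
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;   -- rolls back other sessions' open transactions
GO

-- Verify the setting (1 = on).
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'MyDatabase';
```

Row versions are kept in tempdb while this mode is on, which is where most of the extra cost mentioned above comes from.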
Having said that, 3-5 minutes for a deletion on tables with ~10k rows seems absurdly slow. You mention foreign keys; are the foreign keys indexed? If not, deletion can cause table scans on the other end to make sure you're not breaking referential integrity (RI), so maybe check that first? What does SQL Server Profiler say for reads/writes for these deletion queries?
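One way to check the foreign-key point is to list the foreign keys that reference a table being deleted from and have no index led by the referencing column. This is a sketch against the catalog views, using myDataTable1 from the question as the example target. It is only an approximation: it does not check multi-column key order or whether an index is disabled.

```sql
-- Foreign keys pointing at dbo.myDataTable1 whose referencing column
-- is not the leading key column of any index on the referencing table.
SELECT
    fk.name                                              AS foreignKeyName,
    OBJECT_SCHEMA_NAME(fkc.parent_object_id)             AS referencingSchema,
    OBJECT_NAME(fkc.parent_object_id)                    AS referencingTable,
    COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS referencingColumn
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
WHERE fk.referenced_object_id = OBJECT_ID(N'dbo.myDataTable1')
  AND NOT EXISTS (
        SELECT 1
        FROM sys.index_columns AS ic
        WHERE ic.object_id   = fkc.parent_object_id
          AND ic.column_id   = fkc.parent_column_id
          AND ic.key_ordinal = 1      -- leading key column of some index
  );
```

Any row returned is a candidate for a new index on the referencing column, since each delete otherwise has to scan that table to confirm no child rows exist.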
One way you might want to try is this:
Example:
-- @PortalId is assumed to be declared and set earlier; dbo.[Table] stands in for the real table.
DECLARE @DeletedRowsCount INT = 1, @BatchSize INT = 300;
WHILE (@DeletedRowsCount > 0) BEGIN
    BEGIN TRANSACTION
    DELETE TOP (@BatchSize) dbo.[Table]
    FROM dbo.[Table]
    WHERE Id = @PortalId;
    SET @DeletedRowsCount = @@ROWCOUNT;
    COMMIT;
    WAITFOR DELAY '00:00:05';
END
I guess you can do the same without a SP as well. In fact, it might be better like that.