
SQL Server delete performance

I have a routine in our .NET web application that allows a user on our platform to clear their account (i.e. delete all their data). This routine runs in a stored procedure that essentially loops through the relevant data tables and clears down all the various items they have created.

The stored procedure looks something like this.

ALTER procedure [dbo].[spDeleteAccountData](
    @accountNumber varchar(30) ) 
AS
BEGIN
    SET ANSI_NULLS ON ;
    SET NOCOUNT ON;

    BEGIN TRAN  
    BEGIN TRY
        DELETE FROM myDataTable1 WHERE accountNumber = @accountNumber
        DELETE FROM myDataTable2 WHERE accountNumber = @accountNumber
        DELETE FROM myDataTable3 WHERE accountNumber = @accountNumber
        -- Etc.........

    END TRY
    BEGIN CATCH
        -- CATCH ERROR
    END CATCH

IF @@TRANCOUNT > 0
    COMMIT TRANSACTION; 
SET ANSI_NULLS OFF;
SET NOCOUNT OFF;
END

The problem is that in some cases a table can hold over 10,000 rows for an account, and the procedure can take 3-5 minutes. During this period all the other connections to the database get blocked, causing time-out errors like the one below:

System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

Are there any general changes I can make to improve performance? I appreciate there are many unknowns related to the design of our database schema, but general best-practice advice would be welcome! I thought about scheduling this task to run in the early hours to minimise impact, but this is far from ideal, as the user wouldn't be able to regain access to their account until the task had completed.

Additional Information:

  • SQL Server 2008 R2 Standard
  • All tables have a clustered index
  • No triggers are associated with delete commands on any of the relevant tables
  • Foreign key references exist on a number of tables, but the deletion order accounts for this.

Edit: 16:52 GMT

The delete proc affects around 20 tables. The largest one has approx 5 million records. The others have no more than 200,000, with some containing only 1,000-2,000 records.

Do you have an index on accountNumber in all tables?

Seeing that you delete using a WHERE clause on that column, an index might help.

Another option (and probably an even better solution) would be to schedule deletion operations at night: when the user chooses to delete their account, you only set a flag, and a delete job runs at night actually deleting the accounts flagged for deletion. A minimal sketch of this approach is shown below.
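A minimal sketch, assuming a hypothetical isPendingDeletion flag column on a hypothetical dbo.accounts table (these names are illustrative, not from the original schema):

-- When the user asks to delete their account, only set a flag:
UPDATE dbo.accounts
SET isPendingDeletion = 1
WHERE accountNumber = @accountNumber;

-- A nightly job (e.g. a SQL Server Agent step) then purges flagged data:
DELETE t1
FROM myDataTable1 AS t1
INNER JOIN dbo.accounts AS a
    ON a.accountNumber = t1.accountNumber
WHERE a.isPendingDeletion = 1;
-- ...repeat for the remaining tables, then remove the account rows.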

If you have an index on the accountNumber field, then I would guess the long deletion time is due to locks (held by other processes) or to foreign keys referencing the affected tables.

  1. If it is due to locks, see whether you can reduce them by using NOLOCK hints where it is actually safe to do so.
  2. If the problem is foreign keys, well, you have to wait. If you do not want to wait, though, and your application logic does not rely on the FKs being enforced (e.g. surfacing FK-violation errors to the application and testing against them), or you feel that for a short period of time you can do without FK checking, then you can disable the related FKs prior to the deletions with ALTER TABLE xxx NOCHECK CONSTRAINT ALL and re-enable them afterwards (see the sketch below).

Of course, purists will blame me for the latter, but I have used it many times when the need arises.
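A minimal sketch of the disable/re-enable pattern, using the question's myDataTable1 as a stand-in:

-- Disable all FK (and check) constraints on the table before the delete:
ALTER TABLE dbo.myDataTable1 NOCHECK CONSTRAINT ALL;

DELETE FROM dbo.myDataTable1 WHERE accountNumber = @accountNumber;

-- Re-enable afterwards; WITH CHECK revalidates existing rows so the
-- constraints remain trusted by the optimizer:
ALTER TABLE dbo.myDataTable1 WITH CHECK CHECK CONSTRAINT ALL;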

SqlCommand.CommandTimeout is the short answer. Increase its value.

http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.commandtimeout.aspx

Note, the Connection Timeout is not the same thing as the CommandTimeout.

...

Do you have an index on "accountNumber" on each table?

You could have a clustered index on the table's surrogate key, but no index on "accountNumber".

...

Basically, you're going to have to look at the execution plan (or post it) here.

But here is some "starter code" for trying an index on that column.

IF EXISTS (SELECT * FROM sys.indexes
           WHERE name = N'IX_myDataTable1_accountNumber'
             AND object_id = OBJECT_ID(N'[dbo].[myDataTable1]'))
    DROP INDEX [IX_myDataTable1_accountNumber] ON [dbo].[myDataTable1]
GO

CREATE INDEX [IX_myDataTable1_accountNumber] ON [dbo].[myDataTable1] ([accountNumber])
GO

It could be worth switching the database into Read Committed Snapshot mode. This has a performance impact; how much depends on your application.

In Read Committed Snapshot mode, writers and readers no longer block each other, although writers still block writers. You don't say what sort of activity on the table is being blocked by the delete, so it's a little hard to say whether this will help. A sketch of enabling it follows the link below.

http://msdn.microsoft.com/en-us/library/ms188277(v=sql.105).aspx
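A minimal sketch, assuming a placeholder database name (the ALTER needs exclusive access to the database; WITH ROLLBACK IMMEDIATE forcibly disconnects other sessions so the option can be set):

-- MyDatabase is a placeholder; substitute your database name.
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;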

Having said that, 3-5 minutes for a deletion on tables with ~10k rows seems absurdly slow. You mention foreign keys; are the foreign key columns indexed? If not, a deletion can cause table scans on the referencing tables to make sure you're not breaking referential integrity, so maybe check that first. What does SQL Server Profiler say for reads/writes on these deletion queries? The sketch below shows one way to find unindexed foreign key columns.
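A sketch of one way to list foreign key columns that are not the leading key column of any index (assumes single-column foreign keys, for simplicity):

-- Find FK columns with no index whose leading key column matches.
SELECT
    OBJECT_NAME(fkc.parent_object_id)                    AS TableName,
    COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS ColumnName,
    OBJECT_NAME(fkc.constraint_object_id)                AS ForeignKeyName
FROM sys.foreign_key_columns AS fkc
WHERE NOT EXISTS (
    SELECT 1
    FROM sys.index_columns AS ic
    WHERE ic.object_id   = fkc.parent_object_id
      AND ic.column_id   = fkc.parent_column_id
      AND ic.key_ordinal = 1  -- leading key column of some index
);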

One way you might want to try is this:

  1. Create a SP.
  2. For each table, delete rows in small batches of a size that works for you (the example below uses 300).
  3. Put each batch deletion inside a transaction and add a custom delay between transactions.

Example:

    DECLARE @DeletedRowsCount INT = 1, @BatchSize INT = 300;
    WHILE (@DeletedRowsCount > 0)
    BEGIN
        BEGIN TRANSACTION;
            -- Delete one batch; @accountNumber is the procedure's parameter.
            DELETE TOP (@BatchSize)
            FROM dbo.myDataTable1
            WHERE accountNumber = @accountNumber;

            SET @DeletedRowsCount = @@ROWCOUNT;
        COMMIT;

        -- Pause so other connections can work between batches.
        WAITFOR DELAY '00:00:05';
    END

I guess you can do the same without an SP as well. In fact, it might even be better that way.
