
.NET, Entity Framework: objects removal - which is faster/better?

Sometimes I have to remove many rows at once, and it takes a long time (for example, removing 3000 elements plus ~21000 related elements takes approx. 4 minutes).

The way I do it:

// removes settlement elements
var settlementElements = context.SettlementElements.Where(x => x.SettlementElementSettlementId == id);
context.SettlementElements.RemoveRange(settlementElements);

I wonder whether fetching only the IDs of those elements, creating empty objects with those IDs, attaching them and removing them would be faster than the RemoveRange call used above. I mean something like this:

var settlementElementsIds = context.SettlementElements.Where(x => x.SettlementElementSettlementId == id).Select(x => x.SettlementElementId);
foreach (var seID in settlementElementsIds)
{
    var obj = new SettlementElements() { SettlementElementId = seID };
    context.SettlementElements.Attach(obj);
    context.SettlementElements.Remove(obj);
}

Do you have any experience with optimizing the removal of elements?

TL;DR

If you want proper bulk delete functionality, use the EFPlus package: http://entityframework-plus.net/

Details

As @Tom Lawrence wrote, using RemoveRange improves the performance of the in-memory loop. Yet the code above still retrieves the records into memory, and EF then generates a separate DELETE statement for each record being deleted. For example, this code:

context.Database.Log = Console.Write;
var id = 111;
var settlementElements = context.SettlementElements
    .Where(x => x.SettlementElementSettlementId == id);
context.SettlementElements.RemoveRange(settlementElements);
context.SaveChanges();

would produce this output, assuming there are 3 records to delete from settlement ID=111:

Opened connection at 10/25/2017 2:08:33 PM +03:00
SELECT
    [Extent1].[SettlementElementId] AS [SettlementElementId],
    [Extent1].[SettlementElementSettlementId] AS [SettlementElementSettlementId],
    [Extent1].[Description] AS [Description]
    FROM [dbo].[SettlementElements] AS [Extent1]
    WHERE [Extent1].[SettlementElementSettlementId] = @p__linq__0
-- p__linq__0: '111' (Type = Int32, IsNullable = false)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 5 ms with result: SqlDataReader

Closed connection at 10/25/2017 2:08:33 PM +03:00
Opened connection at 10/25/2017 2:08:33 PM +03:00
Started transaction at 10/25/2017 2:08:33 PM +03:00
DELETE [dbo].[SettlementElements]
WHERE ([SettlementElementId] = @0)
-- @0: '1111' (Type = Int32)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 1 ms with result: 1

DELETE [dbo].[SettlementElements]
WHERE ([SettlementElementId] = @0)
-- @0: '1112' (Type = Int32)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 0 ms with result: 1

DELETE [dbo].[SettlementElements]
WHERE ([SettlementElementId] = @0)
-- @0: '1113' (Type = Int32)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 0 ms with result: 1

Committed transaction at 10/25/2017 2:08:33 PM +03:00
Closed connection at 10/25/2017 2:08:33 PM +03:00

This is definitely not an option if we have millions of records to delete!

With EFPlus, the following code:

using Z.EntityFramework.Plus;

// ...

context.Database.Log = Console.Write;
var id = 111;
var settlementElements = context.SettlementElements
    .Where(x => x.SettlementElementSettlementId == id);
settlementElements.Delete(); // <-- proper bulk delete method; executes immediately
context.SaveChanges();       // not needed for Delete() itself; only persists other pending changes

will produce this output for the same example data:

Opened connection at 10/25/2017 2:20:29 PM +03:00

DECLARE @rowAffected INT
DECLARE @totalRowAffected INT

SET @totalRowAffected = 0

WHILE @rowAffected IS NULL
    OR @rowAffected > 0
    BEGIN
        DELETE TOP (4000)
        FROM    A
        FROM    [dbo].[SettlementElements] AS A
                INNER JOIN ( SELECT
    [Extent1].[SettlementElementId] AS [SettlementElementId]
    FROM [dbo].[SettlementElements] AS [Extent1]
    WHERE [Extent1].[SettlementElementSettlementId] = @p__linq__0
                           ) AS B ON A.[SettlementElementId] = B.[SettlementElementId]

        SET @rowAffected = @@ROWCOUNT
        SET @totalRowAffected = @totalRowAffected + @rowAffected
    END

SELECT  @totalRowAffected
-- p__linq__0: '111' (Type = Int32, IsNullable = false)
-- Executing at 10/25/2017 2:20:29 PM +03:00
-- Completed in 2 ms with result: 3

Closed connection at 10/25/2017 2:20:29 PM +03:00

We can see that:

  • the separate SELECT statement is gone; no records are fetched into memory
  • there is a single batched DELETE statement (up to 4000 rows per iteration), not one per record
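If adding a package is not an option, a similar set-based delete can be sketched with plain EF6 by sending the SQL yourself. This is only a minimal sketch: the helper class and method names are hypothetical, and the table and column names are taken from the log output above.

```csharp
using System.Data.Entity;          // EF6
using System.Data.SqlClient;

public static class SettlementCleanup
{
    // Hypothetical helper: deletes all elements of one settlement
    // with a single set-based statement, without loading them first.
    // Returns the number of rows the database reports as affected.
    public static int DeleteElements(DbContext context, int settlementId)
    {
        const string sql =
            "DELETE FROM [dbo].[SettlementElements] " +
            "WHERE [SettlementElementSettlementId] = @settlementId";

        // The id is passed as a parameter, not concatenated into the SQL.
        return context.Database.ExecuteSqlCommand(
            sql, new SqlParameter("@settlementId", settlementId));
    }
}
```

Note that the statement bypasses the change tracker, so entities already loaded into the context are not marked as deleted. Whether the related rows are removed too depends on how the foreign keys are configured in your database.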

Whenever I have had to do any form of bulk processing, I have found speed improvements by doing it in a stored procedure. You could run a quick test by creating a stored procedure that performs the delete and comparing the timings. I can help with that if a stored procedure is an option for you.
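As a rough sketch of what that test could look like from the EF6 side: the procedure name dbo.DeleteSettlementElements and the helper class are assumptions made up for illustration, and the table and column names are taken from the question.

```csharp
using System.Data.Entity;          // EF6
using System.Data.SqlClient;

public static class SettlementProcedures
{
    // Assumed T-SQL, created once in the database (hypothetical name):
    //   CREATE PROCEDURE dbo.DeleteSettlementElements @settlementId INT AS
    //   BEGIN
    //       DELETE FROM dbo.SettlementElements
    //       WHERE SettlementElementSettlementId = @settlementId;
    //   END
    public static void DeleteElements(DbContext context, int settlementId)
    {
        // Runs the procedure on the context's connection.
        context.Database.ExecuteSqlCommand(
            "EXEC dbo.DeleteSettlementElements @settlementId",
            new SqlParameter("@settlementId", settlementId));
    }
}
```

Timing this call against the RemoveRange version on the same data should show whether the stored procedure is worth it in your case.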

Instead of a stored procedure, you could also try RemoveRange, if it is available in your version (EF6):

db.SettleElements.RemoveRange(itemstodelete);
db.SaveChanges();

The docs say that RemoveRange may perform significantly better than calling Remove multiple times.

Source: https://msdn.microsoft.com/en-us/library/system.data.entity.dbset.removerange(v=vs.113).aspx
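The gain comes from change detection: with automatic change detection enabled, Remove triggers DetectChanges on every call, while RemoveRange triggers it once. On versions without RemoveRange, a comparable effect can be sketched by switching automatic change detection off around the loop. The helper below is hypothetical and assumes the entities are already tracked by the context.

```csharp
using System.Collections.Generic;
using System.Data.Entity;          // EF6

public static class RemovalHelpers
{
    // Hypothetical helper: marks every entity as deleted without
    // running change detection once per Remove call.
    public static void RemoveAll<T>(DbContext context, IEnumerable<T> entities)
        where T : class
    {
        var set = context.Set<T>();
        bool previous = context.Configuration.AutoDetectChangesEnabled;
        context.Configuration.AutoDetectChangesEnabled = false;
        try
        {
            foreach (var entity in entities)
            {
                set.Remove(entity);
            }
        }
        finally
        {
            // Restore the original setting even if Remove throws.
            context.Configuration.AutoDetectChangesEnabled = previous;
        }
    }
}
```

This only speeds up the in-memory part. SaveChanges still sends one DELETE statement per entity.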

As @felix-b answered,

The fastest way, if you can use it, is my free library Entity Framework Plus with its BatchDelete feature.

Another technique, if you have already loaded all the entities you want to delete, is BulkDelete.

Disclaimer: I'm the owner of the project Entity Framework Extensions.

This library is not free but allows you to perform all bulk operations including BulkDelete required by your application:

  • BulkSaveChanges
  • BulkInsert
  • BulkUpdate
  • BulkDelete
  • BulkMerge
  • BulkSynchronize

Example:

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Primary Key
context.BulkMerge(customers, operation => {
   operation.ColumnPrimaryKeyExpression = 
        customer => customer.Code;
});
