Sometimes I have to remove many rows at once, and it takes a long time (for example, removing 3000 elements plus ~21000 related elements takes approx. 4 minutes).
The way I do it:
// removes settlement elements
var settlementElements = context.SettlementElements.Where(x => x.SettlementElementSettlementId == id);
context.SettlementElements.RemoveRange(settlementElements);
I wonder whether fetching only the IDs of those elements, creating empty objects with those IDs, attaching them and removing them would be faster than the RemoveRange call used above. I mean something like this:
var settlementElementsIds = context.SettlementElements.Where(x => x.SettlementElementSettlementId == id).Select(x => x.SettlementElementId);
foreach (var seID in settlementElementsIds)
{
    var obj = new SettlementElements() { SettlementElementId = seID };
    context.SettlementElements.Attach(obj);
    context.SettlementElements.Remove(obj);
}
Do you have any experience optimizing the removal of elements?
TL;DR
If you want proper bulk delete functionality, use the EFPlus package: http://entityframework-plus.net/
Details
As @Tom Lawrence wrote, using RemoveRange would improve the performance of the in-memory loop. Yet the code above still retrieves the records into memory, and EF then generates a separate DELETE statement for each record being deleted, as shown in the example below:
context.Database.Log = Console.Write;
var id = 111;
var settlementElements = context.SettlementElements
    .Where(x => x.SettlementElementSettlementId == id);
context.SettlementElements.RemoveRange(settlementElements);
context.SaveChanges();
would produce this output, assuming there are 3 records to delete from settlement ID=111:
Opened connection at 10/25/2017 2:08:33 PM +03:00
SELECT
[Extent1].[SettlementElementId] AS [SettlementElementId],
[Extent1].[SettlementElementSettlementId] AS [SettlementElementSettlementId],
[Extent1].[Description] AS [Description]
FROM [dbo].[SettlementElements] AS [Extent1]
WHERE [Extent1].[SettlementElementSettlementId] = @p__linq__0
-- p__linq__0: '111' (Type = Int32, IsNullable = false)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 5 ms with result: SqlDataReader
Closed connection at 10/25/2017 2:08:33 PM +03:00
Opened connection at 10/25/2017 2:08:33 PM +03:00
Started transaction at 10/25/2017 2:08:33 PM +03:00
DELETE [dbo].[SettlementElements]
WHERE ([SettlementElementId] = @0)
-- @0: '1111' (Type = Int32)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 1 ms with result: 1
DELETE [dbo].[SettlementElements]
WHERE ([SettlementElementId] = @0)
-- @0: '1112' (Type = Int32)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 0 ms with result: 1
DELETE [dbo].[SettlementElements]
WHERE ([SettlementElementId] = @0)
-- @0: '1113' (Type = Int32)
-- Executing at 10/25/2017 2:08:33 PM +03:00
-- Completed in 0 ms with result: 1
Committed transaction at 10/25/2017 2:08:33 PM +03:00
Closed connection at 10/25/2017 2:08:33 PM +03:00
This is definitely not an option if we have millions of records to delete!
With EFPlus, the following code:
using Z.EntityFramework.Plus;
// ...
context.Database.Log = Console.Write;
var id = 111;
var settlementElements = context.SettlementElements
    .Where(x => x.SettlementElementSettlementId == id);
settlementElements.Delete(); // <-- proper bulk delete method
context.SaveChanges();
will produce this output for the same example data:
Opened connection at 10/25/2017 2:20:29 PM +03:00
DECLARE @rowAffected INT
DECLARE @totalRowAffected INT
SET @totalRowAffected = 0
WHILE @rowAffected IS NULL
OR @rowAffected > 0
BEGIN
DELETE TOP (4000)
FROM A
FROM [dbo].[SettlementElements] AS A
INNER JOIN ( SELECT
[Extent1].[SettlementElementId] AS [SettlementElementId]
FROM [dbo].[SettlementElements] AS [Extent1]
WHERE [Extent1].[SettlementElementSettlementId] = @p__linq__0
) AS B ON A.[SettlementElementId] = B.[SettlementElementId]
SET @rowAffected = @@ROWCOUNT
SET @totalRowAffected = @totalRowAffected + @rowAffected
END
SELECT @totalRowAffected
-- p__linq__0: '111' (Type = Int32, IsNullable = false)
-- Executing at 10/25/2017 2:20:29 PM +03:00
-- Completed in 2 ms with result: 3
Closed connection at 10/25/2017 2:20:29 PM +03:00
We can see that:
The SELECT statement is gone; there is no fetch into memory.
There is a single DELETE statement, not one per record.
Whenever I have had to do any form of bulk processing, I have found speed improvements by doing it in a stored procedure. You could run a quick test by creating a delete procedure and comparing the timings; I can help with that if a stored procedure is an option for you.
You could also try using RemoveRange instead of a stored procedure, if it is available in your version (EF6):
db.SettlementElements.RemoveRange(itemstodelete);
db.SaveChanges();
The docs say RemoveRange may perform significantly better than calling Remove multiple times, because DetectChanges is called only once before the entities are deleted rather than once per entity.
Source: https://msdn.microsoft.com/en-us/library/system.data.entity.dbset.removerange(v=vs.113).aspx
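As a rough illustration of the set-based idea behind the stored-procedure suggestion, here is a minimal sketch that sends one parameterized DELETE through EF6's Database.ExecuteSqlCommand instead of loading entities first. The helper class and method names are hypothetical, and the table and column names are assumed from the SQL logged earlier on this page; adjust them to your schema.

```csharp
using System.Data.Entity; // EntityFramework 6 package

// Hypothetical helper, not part of EF or of the original question.
public static class SettlementCleanup
{
    // Deletes all elements of one settlement with a single DELETE statement,
    // without loading any entities into the context.
    // Returns the number of rows affected, as reported by the database.
    public static int DeleteElements(DbContext context, int settlementId)
    {
        // EF6 binds settlementId to the @p0 placeholder.
        // Table and column names are assumptions taken from the logged SQL.
        return context.Database.ExecuteSqlCommand(
            "DELETE FROM [dbo].[SettlementElements] " +
            "WHERE [SettlementElementSettlementId] = @p0",
            settlementId);
    }
}
```

It could be called as SettlementCleanup.DeleteElements(context, id). Note that entities already tracked by the context are not updated by a direct SQL command, so this fits best when the deleted rows have not been loaded, or when the context is discarded afterwards.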
As @felix-b answered, the fastest way, if you can use it, is my free library Entity Framework Plus with its BatchDelete feature.
Another technique, if you already have all the entities you want to delete, is BulkDelete.
Disclaimer: I'm the owner of the project Entity Framework Extensions.
This library is not free, but it lets you perform all the bulk operations your application requires, including BulkDelete:
Example:
// Easy to use
context.BulkSaveChanges();
// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);
// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);
// Customize Primary Key
context.BulkMerge(customers, operation => {
    operation.ColumnPrimaryKeyExpression =
        customer => customer.Code;
});