Entity Framework: how to improve bulk update performance?

I have some code that does some calculations and, based on the result, updates a column in one table with a new value. It's fast in the beginning, but it takes longer and longer as it runs (performance seems to degrade exponentially over time).

Is there a way to improve performance, for example by manually specifying what needs to be updated?

(So far my best way to tackle this issue was to create a stored procedure that works as a bulk update, but I'm wondering if there is a native way of doing this in Entity Framework.)

My code is something like:

public void UpdateValues()
{
    // Group the items that need recalculating by bag.
    var itemsPerBag = _dbContext.Items
                                .Where(i => i.needsToBeRecalculated)
                                .GroupBy(i => i.BagId);

    foreach (var bag in itemsPerBag)
    {
        CalculateValue(bag);
    }

    _dbContext.SaveChanges();
}

public void CalculateValue(IEnumerable<Item> bag)
{
    foreach (var item in bag)
    {
        // Calls the overload that computes the new value for a single item.
        item.calculatedValue = CalculateValue(item);
    }
}

It is not literally this, but I'm doing my updates per "group" rather than one by one, to try to make each commit neither too big nor too small.

I have around 850 "bags"/saves and 25000 items, and this is taking 1 min to do 11000 updates and 4 min to do the 25000 updates.

I think this is a rather small amount of data that should be processed much more quickly; the calculations I'm doing are very simple.

EDIT:

The only way I've managed to improve performance from 4 minutes to 20 seconds was to create a stored procedure in the database to update the data, and call it instead of SaveChanges().

private async Task UpdatePlanItems(IEnumerable<Item> items)
{
    // Pass all rows to the stored procedure as one table-valued parameter.
    SqlParameter param = new SqlParameter();
    param.ParameterName = "@Items";
    param.SqlDbType = SqlDbType.Structured;
    param.Value = GetItemsTable(items);
    param.TypeName = "dbo.ItemUpdateType";

    await _databaseStatement.ExecuteAsync("EXEC dbo.usp_UpdateItemValue {0}", param);
}

private DataTable GetItemsTable(IEnumerable<Item> items)
{
    // Column names and types must match dbo.ItemUpdateType.
    var table = new DataTable();
    table.Columns.Add("ItemId", typeof(int));
    table.Columns.Add("Value", typeof(int));

    foreach (var item in items)
    {
        var row = table.NewRow();
        row["ItemId"] = item.ItemId;
        row["Value"] = item.Value;
        table.Rows.Add(row);
    }

    return table;
}

On the database I had to run this:

CREATE TYPE [dbo].[ItemUpdateType] AS TABLE(
              [ItemId] [int] NULL,
              [Value] [int] NULL
)
GO

CREATE PROCEDURE [dbo].[usp_UpdateItemValue]
    (@PlanItems [dbo].ItemUpdateType READONLY)
AS
BEGIN
    -- Update every matching row in one set-based statement.
    UPDATE i
    SET i.Value = s.Value
    FROM [dbo].[Item] i
    INNER JOIN @PlanItems s ON s.ItemId = i.ItemId
END

You should not save changes after every change; do it after making all the changes (or at least in batches of 100, 1000, ...). Otherwise you're making n database calls (one per item) instead of just one (for all items). Your code should look like this:

public void UpdateValues()
{
    var itemsPerBag = _dbContext.Items.Where(i => i.needsToBeRecalculated)
                                      .GroupBy(i => i.BagId);

    foreach (var bag in itemsPerBag)
    {
        CalculateValue(bag);
    }

    // Save once, after all items have been recalculated.
    _dbContext.SaveChanges();
}
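The "batches of 100/1000" idea can be sketched as a small helper. This is only an illustration under assumptions: `BatchedSave` and `UpdateInBatches` are hypothetical names, the `save` delegate stands in for `_dbContext.SaveChanges()`, and `Enumerable.Chunk` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchedSave
{
    // Applies `update` to every item and invokes `save` once per batch.
    // Returns the number of times `save` was invoked.
    public static int UpdateInBatches<T>(
        IEnumerable<T> items, int batchSize, Action<T> update, Action save)
    {
        int saves = 0;

        // Enumerable.Chunk splits the sequence into arrays of at most batchSize items.
        foreach (T[] batch in items.Chunk(batchSize))
        {
            foreach (T item in batch)
            {
                update(item);
            }

            // Hypothetical stand-in: with EF this would be _dbContext.SaveChanges().
            save();
            saves++;
        }

        return saves;
    }
}
```

With the numbers from the question, 25000 items and a batch size of 1000 would result in 25 saves. When the items come from an EF query, it is probably safer to materialize them first (for example with `ToList()`) so that no query is still being read while `SaveChanges()` runs.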

Also, accumulating too many changes without committing to the database (that is, not working in batches) can still be slow, especially when many data items are in the Modified state. You may want to disable automatic change detection, since it checks every tracked item each time it runs, and instead run change detection manually once at the end. If you're sharing the DbContext instance (which you should not), you may also need to re-enable automatic change detection afterwards.

// Turn off automatic change detection
// (EF6 API; in EF Core the flag is _dbContext.ChangeTracker.AutoDetectChangesEnabled)
_dbContext.Configuration.AutoDetectChangesEnabled = false;

// All your operations (calculation/updating/adding items/...)
AllYourUpdatesToItems();

// Manually call detect changes so EF's SaveChanges() actually commits something
_dbContext.ChangeTracker.DetectChanges();
_dbContext.SaveChanges();
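If the context is shared, the previous setting can be restored in a `finally` block so that a failed update does not leave change detection switched off. The following is a minimal sketch assuming EF6 (`System.Data.Entity`); `ChangeDetectionScope` and `SaveWithManualDetection` are hypothetical names, not part of Entity Framework.

```csharp
using System;
using System.Data.Entity; // EF6; this sketch does not apply to EF Core as written

public static class ChangeDetectionScope
{
    // Runs applyUpdates with automatic change detection disabled, then detects
    // changes once, saves, and restores the previous setting.
    public static int SaveWithManualDetection(DbContext context, Action applyUpdates)
    {
        bool previous = context.Configuration.AutoDetectChangesEnabled;
        context.Configuration.AutoDetectChangesEnabled = false;

        try
        {
            applyUpdates();

            // One explicit scan so SaveChanges() sees the modified entities.
            context.ChangeTracker.DetectChanges();
            return context.SaveChanges();
        }
        finally
        {
            // Restore the original value even if an exception was thrown.
            context.Configuration.AutoDetectChangesEnabled = previous;
        }
    }
}
```

A call might look like `ChangeDetectionScope.SaveWithManualDetection(_dbContext, AllYourUpdatesToItems);`.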

Just try this. You don't need to call SaveChanges() after recalculating each item. Calling it once, after all the recalculation is done, is enough.

var itemsPerBag = _dbContext.Items.Where(i => i.needsToBeRecalculated)
                                  .GroupBy(i => i.BagId).ToArray();

foreach (var bag in itemsPerBag) { CalculateValue(bag); }

_dbContext.SaveChanges();

I have developed a project which benchmarks the bulk insert method in EF and other ORMs in .NET.

Check it out: GitHub link

In this project, I have implemented bulk insert in different ways and measured the elapsed time for each technique.

Four different techniques for bulk insert, with and without Entity Framework:

1- EFCore.BulkExtensions => https://github.com/borisdj/EFCore.BulkExtensions

2- Bulk-Operations => https://github.com/zzzprojects/Bulk-Operations

3- EF Core AddRange => https://github.com/dotnet/efcore

4- Microsoft SqlBulkCopy => https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlbulkcopy?view=dotnet-plat-ext-5.0

EF7 (EF Core 7) now supports bulk update and delete through ExecuteUpdate and ExecuteDelete. See the details here.
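As a rough sketch of what the EF Core 7 bulk update could look like for the question's scenario: the `Item` entity, `AppDbContext`, the `Quantity` property, and the `Quantity * 2` formula below are all assumptions made for illustration. `ExecuteUpdate` and `SetProperty` are the EF Core 7 APIs, and the approach only applies when the calculation can be translated to SQL.

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore; // EF Core 7 or later

// Hypothetical entity; property names are assumptions, not taken from the question.
public class Item
{
    public int ItemId { get; set; }
    public int BagId { get; set; }
    public int Quantity { get; set; }
    public int CalculatedValue { get; set; }
    public bool NeedsToBeRecalculated { get; set; }
}

// Hypothetical context exposing the Items set.
public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }

    public DbSet<Item> Items => Set<Item>();
}

public static class BulkRecalculation
{
    // Issues a single UPDATE statement without loading or tracking entities.
    // Returns the number of rows affected.
    public static int Recalculate(AppDbContext db)
    {
        return db.Items
            .Where(i => i.NeedsToBeRecalculated)
            .ExecuteUpdate(setters => setters
                // Placeholder formula; it must be expressible in SQL.
                .SetProperty(i => i.CalculatedValue, i => i.Quantity * 2)
                .SetProperty(i => i.NeedsToBeRecalculated, false));
    }
}
```

Because `ExecuteUpdate` bypasses the change tracker, entities already loaded into the context are not refreshed by it, and the statement runs immediately rather than waiting for `SaveChanges()`.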
