
Entity Framework 4.0 performance of bulk operations

I'm currently improving the performance of an existing C# project that uses Entity Framework 4.0. The application performs two types of bulk operations:

  1. Bulk inserts
  2. Bulk deletes

Currently both are done with plain SQL statements ("INSERT INTO...", "DELETE FROM..."). The insert is not a true bulk insert statement, just a "normal" INSERT statement.

As I'm pretty new to C#, my first step was to look into how the performance would be if I used Entity Framework for the updates and deletes.

My question here is threefold:

  1. Is it true that a bulk insert through Entity Framework issues one INSERT per data row (and thus one round-trip per insert), so that performance is worse than using a plain SQL "insert"?
  2. Does this also hold true for delete statements?
  3. What is the best practice here? To use SQL statements, Entity Framework, or something else entirely?

(For the data rows, I'm talking about sizes of 2k-200k each.)

Thanks

My question here is threefold:

Is it true that a bulk insert through Entity Framework issues one INSERT per data row (and thus one round-trip per insert), so that performance is worse than using a plain SQL "insert"?

Yes. If you want to use EF, set the following properties to false to get faster performance:

MyContext.Configuration.AutoDetectChangesEnabled = false;
MyContext.Configuration.ValidateOnSaveEnabled = false;

Does this also hold true for delete statements?

Yes. You can also define ON DELETE CASCADE on the foreign keys in the database; the database then deletes the referencing rows itself, so there is no need to delete them through EF.

What is the best practice here? To use SQL statements, Entity Framework, or something else entirely?

You can use a stored procedure, or execute raw SQL through your context. With an ObjectContext:

MyContext.ExecuteStoreCommand("your query");

or, with a DbContext (EF 4.1 and later):

MyContext.Database.ExecuteSqlCommand("your query");

Another approach is to call SaveChanges() after each batch (for example 100 or 200 entities marked as Added or Deleted) and then dispose the context so that the entities do not stay attached. Then create a new context, build the next batch, and call SaveChanges() again.
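The batching idea above can be sketched without tying it to a particular EF version. This is only an illustration: `BatchSaver` and `SaveInBatches` are hypothetical names, and the context is passed in as delegates so that the helper works with any disposable context type.

```csharp
using System;
using System.Collections.Generic;

public static class BatchSaver
{
    // Stages items on short-lived contexts. After every batchSize items the
    // current context is saved and disposed, and a fresh one is created for
    // the next batch. Returns the number of batches that were saved.
    public static int SaveInBatches<TContext, TItem>(
        IEnumerable<TItem> items,
        int batchSize,
        Func<TContext> createContext,      // e.g. () => new MyContext()
        Action<TContext, TItem> stage,     // marks one entity as Added or Deleted
        Action<TContext> save)             // e.g. ctx => ctx.SaveChanges()
        where TContext : class, IDisposable
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        int batches = 0;
        int pending = 0;
        TContext context = null;
        try
        {
            foreach (TItem item in items)
            {
                if (context == null) context = createContext();
                stage(context, item);
                pending++;

                if (pending == batchSize)
                {
                    save(context);
                    context.Dispose();     // detach everything tracked so far
                    context = null;
                    pending = 0;
                    batches++;
                }
            }

            // Save the final, partially filled batch (if any).
            if (context != null)
            {
                save(context);
                batches++;
            }
        }
        finally
        {
            if (context != null) context.Dispose();
        }
        return batches;
    }
}
```

With a DbContext this might be called as `BatchSaver.SaveInBatches(rows, 200, () => new MyContext(), (ctx, row) => ctx.Orders.Add(row), ctx => ctx.SaveChanges());`, where `Orders` is an assumed entity set. Each batch still sends one statement per row; the gain comes from keeping the change tracker small.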

UPDATE

I haven't used this approach myself, but you can try it out.

SqlBulkCopy for Generic List (useful for Entity Framework & NHibernate)

Reusable generic version below, which produced 15k inserts in 2.4s, or about 6,200 rows per second. I upped it to 4 catalogs, 224,392 rows in 39s, for about 5,750 rows per second (changing between 4 files).

If you truly want to insert bulk data, you probably want to use SqlBulkCopy. You can use it in the same transaction that your EF context uses.
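A minimal sketch of what a SqlBulkCopy insert for a generic list could look like. It assumes that `T` is a flat class whose public properties are all scalar values with the same names as the destination columns (navigation properties of EF entities would need to be excluded), and that `System.Data.SqlClient` is available. `BulkInsert`, `ToDataTable`, and `Write` are illustrative names, and the batch size is arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Reflection;

public static class BulkInsert
{
    // Builds a DataTable with one column per public readable property of T.
    public static DataTable ToDataTable<T>(IEnumerable<T> rows)
    {
        var props = new List<PropertyInfo>();
        foreach (PropertyInfo p in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers.
            if (p.CanRead && p.GetIndexParameters().Length == 0) props.Add(p);
        }

        var table = new DataTable(typeof(T).Name);
        foreach (PropertyInfo p in props)
        {
            // DataTable columns cannot be Nullable<T>; use the underlying type.
            Type columnType = Nullable.GetUnderlyingType(p.PropertyType) ?? p.PropertyType;
            table.Columns.Add(p.Name, columnType);
        }

        foreach (T row in rows)
        {
            var values = new object[props.Count];
            for (int i = 0; i < props.Count; i++)
                values[i] = props[i].GetValue(row, null) ?? DBNull.Value;
            table.Rows.Add(values);
        }
        return table;
    }

    // Streams the rows to the server in bulk. The connection must be open;
    // the transaction may be null or the one the surrounding work uses.
    public static void Write<T>(SqlConnection connection, SqlTransaction transaction,
                                string destinationTable, IEnumerable<T> rows)
    {
        DataTable table = ToDataTable(rows);
        using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.Default, transaction))
        {
            bulk.DestinationTableName = destinationTable;
            bulk.BatchSize = 5000; // arbitrary; tune for the workload
            // Map by name so the column order in the table does not matter.
            foreach (DataColumn column in table.Columns)
                bulk.ColumnMappings.Add(column.ColumnName, column.ColumnName);
            bulk.WriteToServer(table);
        }
    }
}
```

Building a DataTable keeps the whole set in memory, which should be acceptable for the 2k-200k rows mentioned in the question; for much larger sets, an IDataReader passed to WriteToServer avoids that.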

EF is not made for bulk operations, and you might find its single-row-per-statement DML approach too restrictive for large sets. It forces lots of round-trips and lots of per-statement overhead, and it prevents SQL Server from optimizing a query plan for many rows at once, which is almost always more efficient than many small queries (for example, SQL Server will properly sort all the rows so that indexes can be updated sequentially).

By using EF to do bulk DML you basically force SQL Server to use per-row DML plans.

Bulk deletes can be handled by bulk-inserting the keys into a temp table and then executing a delete statement joining to that table.
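The temp-table delete could look roughly like the following. This is a sketch under assumptions: the target table `dbo.Orders` with an integer key `Id` is hypothetical, as are the names `BulkDelete`, `ToKeyTable`, `DeleteOrders`, and `#KeysToDelete`. The connection must already be open, because a temp table exists only within the session that created it.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class BulkDelete
{
    // Builds a single-column table of distinct keys for SqlBulkCopy.
    public static DataTable ToKeyTable(IEnumerable<int> ids)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        // Duplicates would violate the temp table's primary key, so drop them.
        foreach (int id in new HashSet<int>(ids))
            table.Rows.Add(id);
        return table;
    }

    // Deletes all rows of dbo.Orders (hypothetical) whose Id is in ids.
    // Returns the number of rows deleted.
    public static int DeleteOrders(SqlConnection connection, IEnumerable<int> ids)
    {
        using (SqlTransaction tx = connection.BeginTransaction())
        {
            using (var create = new SqlCommand(
                "CREATE TABLE #KeysToDelete (Id INT NOT NULL PRIMARY KEY);", connection, tx))
            {
                create.ExecuteNonQuery();
            }

            // Bulk-insert the keys over the same connection and transaction.
            using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.Default, tx))
            {
                bulk.DestinationTableName = "#KeysToDelete";
                bulk.WriteToServer(ToKeyTable(ids));
            }

            // One set-based delete instead of one statement per row.
            const string deleteSql =
                "DELETE o FROM dbo.Orders AS o " +
                "INNER JOIN #KeysToDelete AS k ON k.Id = o.Id;";
            int deleted;
            using (var delete = new SqlCommand(deleteSql, connection, tx))
            {
                deleted = delete.ExecuteNonQuery();
            }

            using (var drop = new SqlCommand("DROP TABLE #KeysToDelete;", connection, tx))
            {
                drop.ExecuteNonQuery();
            }

            tx.Commit();
            return deleted;
        }
    }
}
```

If the method throws before the commit, disposing the transaction rolls back both the delete and the temp table.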
