
How to Improve Entity Framework bulk insert

I have an application which receives data from multiple sockets and then writes the data into a DB.

I am currently using EF to do this. I would like to know how I can make it more efficient.

I have read that doing a bulk insert is faster, so I am only saving changes to the DB every 500 inserts:

    db.Logs_In.Add(tableItem);
    if (logBufferCounter++ > 500)
    {
        db.SaveChanges();
        logBufferCounter = 0;
    }

Now I have profiled the application, and 74% of the work is being done by the function System.Data.Entity.DbSet`1[System.__Canon].Add.

Is there a better way to do the insert? Maybe queue up the tableItems into a List and then add the whole list to the DB Context.
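For example, something along these lines (a sketch of that idea using EF6's AddRange, with automatic change detection turned off for the batch; LogEntry stands in for the entity type of tableItem, and MyDbContext for my context class):

    // Buffer items in a plain list, then hand them to EF in one call.
    // AddRange runs DetectChanges once instead of once per Add (EF6),
    // which is where most of the DbSet.Add time goes.
    private readonly List<LogEntry> buffer = new List<LogEntry>(500);

    void Enqueue(LogEntry tableItem)
    {
        buffer.Add(tableItem);
        if (buffer.Count >= 500)
        {
            Flush();
        }
    }

    void Flush()
    {
        using (var db = new MyDbContext())
        {
            db.Configuration.AutoDetectChangesEnabled = false;
            db.Logs_In.AddRange(buffer);
            db.SaveChanges();
        }
        buffer.Clear();
    }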

Or maybe I'm looking at it all wrong and I should avoid using Entity Framework for this high-performance insert altogether? Currently it is the bottleneck in my application, and if I look at the system resources, SQL Server doesn't even seem to be batting an eyelid.

So my questions:

1: What is the most efficient / quickest way to perform a large number of inserts?

2: If EF is acceptable, how can I improve my solution?

I am using SQL Server 2012 Enterprise Edition. The incoming data is a constant stream, but I can afford to buffer it and then do a bulk insert if that is a better solution.

[EDIT]

To further explain the scenario: I have a thread which loops on a ConcurrentQueue, dequeuing the items from it. However, because the DB insert is the bottleneck, there are often thousands of entries in the queue. So I am also interested in any async or parallel approach where I could use more than one thread to do the insert.
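As an illustration of what I mean (a sketch only; the queue, the LogEntry type, and WriteBatch are placeholders for my actual queue, entity type, and whatever bulk write routine ends up being used):

    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading;

    // Drain the queue in batches so each DB round trip carries many rows.
    static void ConsumerLoop(ConcurrentQueue<LogEntry> queue, CancellationToken ct)
    {
        var batch = new List<LogEntry>(500);
        while (!ct.IsCancellationRequested)
        {
            LogEntry item;
            while (batch.Count < 500 && queue.TryDequeue(out item))
            {
                batch.Add(item);
            }
            if (batch.Count > 0)
            {
                WriteBatch(batch);   // e.g. SaveChanges or SqlBulkCopy
                batch.Clear();
            }
            else
            {
                Thread.Sleep(50);    // queue is empty; back off briefly
            }
        }
    }

Several such consumers could run in parallel (e.g. via Task.Run), as long as the batch write routine is safe to call concurrently.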

For scenarios that involve large amounts of inserts, I tend to favor "buffer separately" (in-memory, or a Redis list, or whatever), then as a batch job (perhaps every minute, or every few minutes) read the list and use SqlBulkCopy to throw the data into the database as efficiently as possible. To help with that, I use the ObjectReader.Create method of FastMember, which exposes a List<T> (or any IEnumerable<T>) as an IDataReader that can be fed into SqlBulkCopy, exposing properties of T as logical columns in the data-reader. All you need to do, then, is fill the List<T> from the buffer.
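Roughly like this (a sketch; the LogEntry type, the member/column names, and the connection handling are placeholders for your actual schema):

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using FastMember;

    static void BulkInsert(List<LogEntry> items, string connectionString)
    {
        using (var bcp = new SqlBulkCopy(connectionString))
        // ObjectReader.Create exposes the list as an IDataReader;
        // the member names listed here become the reader's columns
        // and should match the destination table's columns.
        using (var reader = ObjectReader.Create(items, "Id", "Timestamp", "Message"))
        {
            bcp.DestinationTableName = "Logs_In";
            bcp.WriteToServer(reader);
        }
    }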

Note, however, that you need to think about the "something goes wrong" scenario; i.e. if the insert fails half way through, what do you do about the data in the buffer? One option here is to do the SqlBulkCopy into a staging table (same schema, but not the "live" table), then use a regular INSERT to copy the data in one step once you know it is at the database - this makes recovery simpler.
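In code, that variant might look like this (again a sketch; Logs_In_Staging and the column names are illustrative):

    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();

        // Step 1: bulk copy into the staging table
        // (same schema as the live table, but not "live").
        using (var bcp = new SqlBulkCopy(conn))
        using (var reader = ObjectReader.Create(items, "Id", "Timestamp", "Message"))
        {
            bcp.DestinationTableName = "Logs_In_Staging";
            bcp.WriteToServer(reader);
        }

        // Step 2: once the data is safely at the database,
        // move it to the live table in one statement.
        using (var cmd = new SqlCommand(
            @"INSERT INTO Logs_In (Id, Timestamp, Message)
              SELECT Id, Timestamp, Message FROM Logs_In_Staging;
              TRUNCATE TABLE Logs_In_Staging;", conn))
        {
            cmd.ExecuteNonQuery();
        }
    }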

