Very slow insert process using Linq to Sql

I'm inserting a large number of records from C# into a SQL Server 2008 Express database using LinqToSql, and the insertion appears to be very slow. The code snippet follows.

public void InsertData(int id)
{
  MyDataContext dc = new MyDataContext();

  List<Item> result = GetItems(id);

  foreach (var item in result)
  {
    DbItem dbItem = new DbItem() { ItemNo = item.No, ItemName = item.Name };
    // Queue the row for insertion; nothing is sent until SubmitChanges().
    dc.Items.InsertOnSubmit(dbItem);
  }

  // Send all queued inserts to the database.
  dc.SubmitChanges();
}

Am I doing anything wrong? Or is using Linq to insert a large number of records a bad choice?

Update: Thanks for all the answers. @p.campbell: Sorry about the record count, it was a typo. It is actually around 100,000, and it can range up to 200k.

Following the suggestions, I split this operation into parts (also a requirement change and design decision): the data is now retrieved in small chunks and inserted into the database as it arrives. I've put the InsertData() method into a thread operation and now use SmartThreadPool to create a pool of 25 threads that perform the same operation. In this scenario I insert only 100 records at a time. When I tried this with Linq and with a plain SQL query, it made no difference in the time taken.

As per my requirement, this operation is scheduled to run every hour and fetches records for around 4k-6k users. So I now treat each user's data operation (retrieving and inserting into the DB) as one task assigned to one thread. The entire process takes around 45 minutes for around 250k records.

Is there a better way to do this kind of task? Or can anyone suggest how I can improve this process?

For inserting a massive amount of data into SQL in one go

Neither Linq nor SqlCommand is designed for bulk copying data into SQL.

You can use the SqlBulkCopy class, which offers managed-code functionality similar to the bcp utility, for bulk loading data into SQL Server from pretty much any data source.

The SqlBulkCopy class can be used to write data only to SQL Server tables. However, the data source is not limited to SQL Server; any data source can be used, as long as the data can be loaded to a DataTable instance or read with an IDataReader instance.
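Applied to the question's scenario, loading the in-memory items into a DataTable and handing it to SqlBulkCopy might look like the sketch below. It rests on assumptions not stated in the thread: the Item class shape (an int No and a string Name), the destination table name dbo.Items, its column names ItemNo and ItemName, and the batch size are all placeholders to adapt.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical record type; property names mirror the question's Item class.
public class Item
{
    public int No { get; set; }
    public string Name { get; set; }
}

public static class BulkInsertSketch
{
    // Copies the items into a DataTable with one column per target column.
    public static DataTable ToDataTable(IEnumerable<Item> items)
    {
        var table = new DataTable();
        table.Columns.Add("ItemNo", typeof(int));
        table.Columns.Add("ItemName", typeof(string));
        foreach (var item in items)
        {
            table.Rows.Add(item.No, item.Name);
        }
        return table;
    }

    // Sends the whole table to SQL Server as a bulk operation.
    // connectionString and "dbo.Items" are placeholders (assumptions).
    public static void BulkInsert(string connectionString, IEnumerable<Item> items)
    {
        DataTable table = ToDataTable(items);
        using (var copy = new SqlBulkCopy(connectionString))
        {
            copy.DestinationTableName = "dbo.Items";
            copy.BatchSize = 5000; // rows sent per batch; an arbitrary starting value
            // Map by column name so the column order in the target table does not matter.
            copy.ColumnMappings.Add("ItemNo", "ItemNo");
            copy.ColumnMappings.Add("ItemName", "ItemName");
            copy.WriteToServer(table);
        }
    }
}
```

Calling BulkInsertSketch.BulkInsert(connectionString, GetItems(id)) would then take the place of the InsertOnSubmit loop. Mapping columns by name rather than by ordinal is a design choice that avoids off-by-one mistakes when the target table has an identity column.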

Performance comparison

SqlBulkCopy is by far the fastest, even when loading data from a simple CSV file.

Linq will just generate a load of INSERT statements in SQL and send them to your SQL Server. This is no different from using ad hoc queries with SqlCommand. The performance of SqlCommand vs. Linq is virtually identical.

The Proof

(SQL Express 2008, .Net 4.0)

SqlBulkCopy

Using SqlBulkCopy to load 100000 rows from a CSV file (the timing includes loading the data)

using (SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=EffectCatalogue;Data Source=.\\SQLEXPRESS;"))
{
    conn.Open();
    Stopwatch watch = Stopwatch.StartNew();

    string csvConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\data\\;Extended Properties='text;'";
    OleDbDataAdapter oleda = new OleDbDataAdapter("SELECT * FROM [test.csv]", csvConnString);
    DataTable dt = new DataTable();
    oleda.Fill(dt);

    using (SqlBulkCopy copy = new SqlBulkCopy(conn))
    {
        copy.ColumnMappings.Add(0, 1);
        copy.ColumnMappings.Add(1, 2);
        copy.DestinationTableName = "dbo.Users";
        copy.WriteToServer(dt);
    }
    Console.WriteLine("SqlBulkCopy: {0}", watch.Elapsed);
}

SqlCommand

using (SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=TestDb;Data Source=.\\SQLEXPRESS;"))
{
    conn.Open();
    Stopwatch watch = Stopwatch.StartNew();
    SqlCommand comm = new SqlCommand("INSERT INTO Users (UserName, [Password]) VALUES ('Simon', 'Password')", conn);
    for (int i = 0; i < 100000; i++)
    {
        comm.ExecuteNonQuery();
    }
    Console.WriteLine("SqlCommand: {0}", watch.Elapsed);
}

LinqToSql

using (SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=TestDb;Data Source=.\\SQLEXPRESS;"))
{
    conn.Open();
    Stopwatch watch = Stopwatch.StartNew();
    EffectCatalogueDataContext db = new EffectCatalogueDataContext(conn);
    for (int i = 0; i < 100000; i++)
    {
        User u = new User();
        u.UserName = "Simon";
        u.Password = "Password";
        db.Users.InsertOnSubmit(u);
    }
    db.SubmitChanges();
    Console.WriteLine("Linq: {0}", watch.Elapsed);
}

Results

SqlBulkCopy: 00:00:02.90704339
SqlCommand: 00:00:50.4230604
Linq: 00:00:48.7702995

If you are inserting a large amount of data, you can try BULK INSERT.

As far as I know, Linq to SQL has no equivalent of bulk insert.
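A minimal sketch of issuing the T-SQL BULK INSERT statement from C# follows. It assumes a comma-separated file with a header row; the class and method names, table name, and file path are placeholders, not anything from the thread. Note that the path is resolved on the SQL Server machine rather than on the client, and that the login needs permission to run bulk operations.

```csharp
using System;
using System.Data.SqlClient;

public static class BulkInsertStatementSketch
{
    // Builds the BULK INSERT statement. The table name and file path cannot be
    // passed as SQL parameters, so only trusted values should be used here.
    public static string BuildStatement(string tableName, string serverFilePath)
    {
        return "BULK INSERT " + tableName +
               " FROM '" + serverFilePath.Replace("'", "''") + "'" +
               " WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', FIRSTROW = 2)";
    }

    // Runs the statement. connectionString is a placeholder (assumption).
    public static void Run(string connectionString, string tableName, string serverFilePath)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var comm = new SqlCommand(BuildStatement(tableName, serverFilePath), conn))
        {
            conn.Open();
            comm.CommandTimeout = 0; // no timeout; a large load can exceed the 30 second default
            comm.ExecuteNonQuery();
        }
    }
}
```

FIRSTROW = 2 skips the header line; drop it if the file has no header.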

You've got SubmitChanges() being called once, which is good. This means that only one connection and transaction are being used.

Consider refactoring your code to use InsertAllOnSubmit() instead.

List<DbItem> newItems = GetItems(id).Select(x => new DbItem { ItemNo = x.No,
                                                              ItemName = x.Name })
                                    .ToList();
// InsertAllOnSubmit is a method of the table, so call it on dc.Items.
dc.Items.InsertAllOnSubmit(newItems);
dc.SubmitChanges();

The INSERT statements are still sent one by one, as before, but this may be more readable.

Some other things to ask/consider:

  • What's the state of the indexes on the target table? Too many will slow down the writes.
  • Is the database in the Simple or Full recovery model?
  • Capture the SQL statements going across the wire, then replay them in an ad hoc query against your SQL Server database. I realize you're using SQL Express and likely don't have SQL Profiler, so use context.Log = Console.Out; to output your LINQ to SQL statements to the console. Prefer SQL Profiler for convenience, though.
  • Do the captured SQL statements perform the same as your client code? If so, then the perf problem is on the database side.

Here's a nice walk-through of how to add a Bulk-Insert class to your application, which hugely improves the performance of inserting records using LINQ.

(All source code is provided, ready to be added to your own application.)

http://www.mikesknowledgebase.com/pages/LINQ/InsertAndDeletes.htm

You would just need to make three changes to your code and link in the class provided. Good luck!
