What would be the fastest way to insert 2.5 million rows of data, parsed from a text file, into SQL Server?
As the question states, I have a text file (700 MB) that I am reading using C#. I parse the 2.5 million lines, convert each line into a class, serialize the class, and then insert it into a SQL Server 2012 database.
The table I am inserting into has two columns and looks like:
{Auto_Increment_id: Serialized Byte Array}
My current strategy is to parse about 10,000 lines, insert them into the database, and then repeat. This is taking about 3 hours to do, so I am sure there is a more efficient way.
One thought I had would be to write the inserts to a text file and do a bulk copy into the database. Any other thoughts?
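That idea can be sketched with T-SQL's `BULK INSERT` statement run from C#. The table name, file path, and connection string below are placeholders, and note that `BULK INSERT` reads a server-side file, so the path must be visible to the SQL Server machine, not just the client:

```csharp
using System.Data.SqlClient;

class BulkInsertFromFile
{
    static void Main()
    {
        using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=true"))
        {
            conn.Open();
            // TABLOCK enables minimally logged loads; BATCHSIZE commits in chunks.
            var sql = @"BULK INSERT dbo.SerializedRows
                        FROM 'C:\data\rows.dat'
                        WITH (TABLOCK, BATCHSIZE = 100000)";
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.CommandTimeout = 0; // large loads can exceed the default 30 s
                cmd.ExecuteNonQuery();
            }
        }
    }
}
```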
Ultimately I want to get this process down to 10-20 minutes. Is this possible?
SqlBulkCopy. Read about it in the documentation.
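A minimal SqlBulkCopy sketch, assuming a two-column target table `dbo.SerializedRows` with an identity key and a `varbinary(max)` Payload column (all names and the connection string are placeholders):

```csharp
using System.Data;
using System.Data.SqlClient;

class BulkCopyExample
{
    static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Payload", typeof(byte[]));

        // In the real program each row would be a serialized class instance,
        // not just the raw line bytes.
        foreach (var line in System.IO.File.ReadLines(@"C:\data\input.txt"))
            table.Rows.Add(System.Text.Encoding.UTF8.GetBytes(line));

        using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=true"))
        {
            conn.Open();
            using (var bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.SerializedRows";
                bulk.BatchSize = 100000;  // commit in large batches
                bulk.BulkCopyTimeout = 0; // no timeout for big loads
                bulk.ColumnMappings.Add("Payload", "Payload");
                bulk.WriteToServer(table);
            }
        }
    }
}
```

Building one DataTable with all 2.5 million rows is memory-hungry; in practice you would fill and flush the table in chunks, or implement `IDataReader` over the parser and stream rows to `WriteToServer` directly.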
FASTER (because SqlBulkCopy is not really written smartly) is to load into a temp table first, then at the end do one insert from there into the final table. SqlBulkCopy locks the whole destination table; staging this way bypasses that lock and allows the final table to be used during the upload.
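The staging variant might look like this, assuming a connection string `connStr` and a `DataTable` called `table` of parsed rows as in the sketch above; the session-local `#Staging` table keeps the bulk-copy lock off the final table entirely:

```csharp
using (var conn = new SqlConnection(connStr))
{
    conn.Open();

    // #Staging lives only in this session, so readers of
    // dbo.SerializedRows are unaffected during the upload.
    new SqlCommand("CREATE TABLE #Staging (Payload varbinary(max))", conn)
        .ExecuteNonQuery();

    using (var bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "#Staging";
        bulk.BulkCopyTimeout = 0;
        bulk.WriteToServer(table);
    }

    // One set-based insert at the end; the final table is locked only briefly.
    var move = new SqlCommand(
        @"INSERT INTO dbo.SerializedRows (Payload)
          SELECT Payload FROM #Staging;
          DROP TABLE #Staging;", conn);
    move.CommandTimeout = 0;
    move.ExecuteNonQuery();
}
```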
Then use multiple threads to insert blocks of a lot more than 10,000 rows each.
I manage more than 100,000 rows per second on a lower-end database server (that is 48 GB memory, about a dozen SAS discs - and yes, that is lower end).