How to efficiently insert/update 10,000 rows in SQL Server using C# while comparing each row against the database
I have been given an Excel file from a customer. It has 4 columns: Id, Name, Place, Date.
I have a table in my database which stores these values. I have to check each row from Excel and compare its values to the database table. If a row already exists, compare the dates and update the row to the latest date from Excel. If the row does not exist yet, insert a new row.
I'm fetching each row and comparing its values in a for loop, then updating the database with insert/update statements through a data table adapter.
My problem is that this operation takes 4+ hours to update the data. Is there a more efficient way to do this? I have searched a lot and found options like SqlBulkCopy, but how would I compare each and every row against the database?
I'm using ASP.NET with C# and SQL Server.
Here's my code:
for (var row = 2; row <= workSheet.Dimension.End.Row; row++)
{
    // Get data from Excel (row 1 is the header row)
    var Id = workSheet.Cells[row, 1].Text;
    var Name = workSheet.Cells[row, 2].Text;
    var Place = workSheet.Cells[row, 3].Text;
    var dateInExcel = DateTime.Parse(workSheet.Cells[row, 4].Text);

    // Check whether this Id exists in the database (the lookup itself is omitted here);
    // "existing" is a placeholder name for the DataTable holding the lookup result.
    if (existing.Rows.Count == 0) // no row exists in the database
    {
        // Insert the row using the data table adapter's Insert statement
    }
    else // Id exists in the database
    {
        if (Db.DateInDB < dateInExcel) // compare dates
        {
            // Update the database with the new date using the data table adapter's Update statement
        }
    }
}
@mjwills and @Dan Guzman make very valid points in the comments section.
My suggestion would be to create an SSIS package that imports the spreadsheet into a temp table, then use one or more MERGE queries to make conditional updates to the required table(s).
https://docs.microsoft.com/en-us/sql/integration-services/import-export-data/start-the-sql-server-import-and-export-wizard?view=sql-server-ver15
The simplest way to get a good starting point is to use the import wizard in SSMS and save the resultant package. Then create an SSIS project in Visual Studio (you will need the correct version of BI installed for the target SQL Server version).
https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt?view=sql-server-ver15
https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-ver15
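As a rough illustration of the conditional update step, a MERGE along these lines could be run once the spreadsheet rows are in a staging table. The table names dbo.ImportStaging and dbo.Target are hypothetical, the column names are taken from the four Excel columns in the question, and the sketch assumes the staging table holds at most one row per Id.

```csharp
using System.Data.SqlClient;

public static class MergeSketch
{
    // Hypothetical table names: dbo.ImportStaging (loaded from Excel) and dbo.Target.
    // Rows missing from the target are inserted; existing rows are updated
    // only when the staged date is newer than the stored one.
    public const string MergeSql = @"
MERGE dbo.Target AS t
USING dbo.ImportStaging AS s
    ON t.Id = s.Id
WHEN MATCHED AND s.[Date] > t.[Date] THEN
    UPDATE SET t.[Date] = s.[Date]
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, Name, Place, [Date])
    VALUES (s.Id, s.Name, s.Place, s.[Date]);";

    // Runs the MERGE as a single set-based statement instead of one query per row.
    public static int Run(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(MergeSql, conn))
        {
            conn.Open();
            return cmd.ExecuteNonQuery();
        }
    }
}
```

The whole comparison happens inside SQL Server in one statement, so the 10,000 round trips from the loop in the question disappear.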
This approach leverages SQL Server doing what it does best, dealing with relational data sets, and moves that work out of the ASP code.
To invoke this, the ASP app would need to handle the initial file upload and then invoke the SSIS package.
This can be done by setting up the SSIS package as a job on the SQL Server with no schedule, and then starting the job when you want it to run.
How to execute an SSIS package from .NET?
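One possible way to start such an unscheduled job from the ASP app is to call the msdb.dbo.sp_start_job system stored procedure. This is only a sketch: the class and method names are made up for illustration, the job name is whatever you called the job, and the connecting login needs permission to start SQL Server Agent jobs.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class JobStarter
{
    // Builds the command that asks SQL Server Agent to start a job by name.
    public static SqlCommand BuildCommand(string jobName)
    {
        var cmd = new SqlCommand("msdb.dbo.sp_start_job");
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@job_name", jobName);
        return cmd;
    }

    // Sends the start request; sp_start_job returns once the job has been
    // started, not when the job has finished.
    public static void StartJob(string connectionString, string jobName)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = BuildCommand(jobName))
        {
            cmd.Connection = conn;
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

Because the call does not wait for the package to complete, the app would need some other way to report completion, for example by checking a status table that the package writes to.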
There are most likely some optimisations that can be made to this approach, but it should work in principle.
Hope this helps :)
10,000 records taking more than 3 × 3600 s suggests more than 1 s per record. I think it should be possible to improve on that.
Doing the work in the database would give the best performance, but there are a few things you can check first.
Check the basics:
Use batches. You should be able to get an order of magnitude better performance if you work in batches rather than one record at a time.
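A minimal sketch of batching, assuming the Excel rows have already been read into memory: collect them in a DataTable and send them to a staging table with SqlBulkCopy, the class mentioned in the question. The staging table name dbo.ImportStaging, the batch size, and the string type for Id are assumptions, not details from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class StagingLoader
{
    // Builds an in-memory table whose columns mirror the four Excel columns.
    public static DataTable BuildTable(IEnumerable<Tuple<string, string, string, DateTime>> rows)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(string));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Place", typeof(string));
        table.Columns.Add("Date", typeof(DateTime));
        foreach (var r in rows)
            table.Rows.Add(r.Item1, r.Item2, r.Item3, r.Item4);
        return table;
    }

    // Sends all rows to the (hypothetical) staging table in batches
    // instead of issuing one INSERT per row.
    public static void BulkLoad(string connectionString, DataTable table)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.ImportStaging";
            bulk.BatchSize = 5000; // assumed value; tune for your data
            foreach (DataColumn c in table.Columns)
                bulk.ColumnMappings.Add(c.ColumnName, c.ColumnName);
            bulk.WriteToServer(table);
        }
    }
}
```

Once the rows are staged, the row-by-row comparison from the question can be replaced by a single set-based statement, such as the MERGE suggested in the other answer, that compares the staging table with the real table.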