简体   繁体   中英

C# and SQL Server 2008 - Batch Update

I have a requirement where I need to update thousands of records in live database table, and although there are many columns in this table, I only need to update 2-3 columns.

Further I can't hit database for thousand times just for updating which can be done in a batch update using SQL Server Table Valued Parameter. But again I shouldn't update all thousands records in one go for better error handling, instead want to update records in batches of x*100.

So, below is my approach, please give your valuable inputs for any other alternatives or any change in the proposed process -

  • 1 Fetch required records from database to List<T> MainCollection
  • 2 Save this collection to XML file with each element Status = Pending
  • 3 Take first 'n' elements from XML file with Status = Pending and add them to new List<T> SubsetCollection
  • 4 Loop over List<T> SubsetCollection - make required changes to T
  • 5 Convert List<T> SubsetCollection to DataTable
  • 6 Call Update Stored Procedure and pass above DataTable as TVP
  • 7 Update Status = Processed for XML Elements corresponding to List<T> SubsetCollection
  • 8 If more records with Pending status exists in XML file, go to Step# 3.

Please guide for a better approach or any enhancement in above process.

I would do a database-only approach if possible and if not possible, eliminate the parts that will be the slowest. If you are unable to do all the work in a stored procedure, then retrieve all the records and make changes.

The next step is to write the changes to a staging table with SQL Bulk Copy. This is a fast bulk loaded that will copy thousands of records in seconds. You will store the primary key and the columns to be updated as well as a batch number. The batch number is assigned to each batch of records, therefore allowing another batch to be loaded without conflicting with the first batch.

Use a stored procedure on the server to process the records in batches of 100 or 1000 depending on performance. Pass the batch number to the stored procedure.

We use such a method to load and update millions of records in batches. The best speed is obtained by eliminating the network and allowing the database server to handle the bulk of the work.

I hope this might provide you with an alternate solution to evaluate.

It may not be the best practice but you could embed some logic inside a SQL Server CLR function. This function could be called by a Query,StoProc or a schedule to run at a certain time.

The only issue I can see is getting step 4 to make the required changes on T . Embedding that logic into the database could be detrimental to maintenance, but this is no different to people who embed massive amounts of business logic into StoProcs.

Either way SQL Server CLR functions may be the way to go. You can create them in Visual Studio 2008, 2010 (Check the database new project types).

Tutorial : http://msdn.microsoft.com/en-us/library/w2kae45k(v=vs.80).aspx

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM