
What is the technique to process 1 million records from a database?

This is more or less a design question. We have to process about 1 million rows and send an XML file to a third party. Initially we have to send all 1 million records; later we will send only the deltas.

Right now the stored procedure is taking approximately 15 to 20 minutes to return the data. It's a console app right now. I know it's not a good way to get 1 million records at a time.

I want to know the following: 1) Is a console app in C# which connects to the database the right approach or not? 2) Are there any other ways of doing this?

I appreciate your guidance on this; there is no need for any coding, we just need some advice on how to proceed. Thanks in advance.

My thoughts:

  • don't fetch all the data and then process it; process it as it arrives - via IDataReader or LINQ
  • use an equally streaming approach for the file; perhaps XmlWriter directly, or maybe XStreamingElement - in either case reading from the source above

this vastly reduces the amount of memory you need, and allows your machine to do something useful while waiting on the network IO
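Putting those two points together, a minimal sketch might look like the following; it assumes SQL Server, and the stored procedure name, connection string, and column names are placeholders, not the poster's real schema. Rows are streamed with SqlDataReader and written straight to the file with XmlWriter, so only the current row is ever held in memory.

    using System.Data;
    using System.Data.SqlClient;
    using System.Xml;

    class StreamingExport
    {
        static void Main()
        {
            // Placeholder connection string and proc name - substitute your own.
            const string connectionString = "Server=.;Database=MyDb;Integrated Security=true";

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand("dbo.GetRecordsToExport", connection)
                                 { CommandType = CommandType.StoredProcedure, CommandTimeout = 0 })
            using (var writer = XmlWriter.Create("export.xml", new XmlWriterSettings { Indent = true }))
            {
                connection.Open();
                writer.WriteStartDocument();
                writer.WriteStartElement("Records");

                // ExecuteReader streams rows as they arrive; nothing is buffered in the app.
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        writer.WriteStartElement("Record");
                        writer.WriteElementString("Id", reader["Id"].ToString());
                        writer.WriteElementString("Name", reader["Name"].ToString());
                        writer.WriteEndElement();
                    }
                }

                writer.WriteEndElement();
                writer.WriteEndDocument();
            }
        }
    }

The same shape works with XStreamingElement if you prefer LINQ-to-XML; the key point is that reading and writing happen in lockstep instead of materialising the whole result set first.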

Re 1: Depends on your architecture. It's that simple. It is a viable approach.

Re 2: Yes, tons, all viable. You could make a system service that handles data generation upon request. You could have a web application.

In general, a console app will work fine, and 1 million rows in a result set are not exactly a lot either. Totally workable.

15-20 minutes is odd, though. Where is the time spent? Transferring and writing out 1 million rows should not take more than 2-3 minutes.

1) Yeah, why not.

2) Yes.

Use cursors.

You will need to be a little more specific about what you are doing during the 15 to 20 minutes.

You are asking about the "right" way to do things - what are you optimising for?

Speed? A 15 - 20 minute stored proc sounds dangerous. What is it doing?

Maintenance / Readability? A console app will work. It would also be easier to test (unit testing etc) than a stored proc.

I have never liked long-running stored procedures because it's not easy to see progress. At least with a console app you can output something.

Trust me, 1 million records isn't a big deal for any mainstream commercial database; it should not take 15 to 20 minutes to return the records. Something else is wrong! Are you building the XML in the stored procedure? If so, remove that and implement the XML building in C#. The SP should have only one simple task: fetching the data. That won't take long unless you are joining 1 million records to another 1 million records. Once the data comes into the application (a console application is fine in this case), build the XML, perhaps with LINQ-to-XML. If you are still not satisfied with the performance, make your code parallel.
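As a rough illustration of that split (the SP only fetches, C# builds the XML), here is a hedged sketch using LINQ-to-XML with XStreamingElement; the proc name, connection string, and columns are invented stand-ins.

    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;
    using System.Linq;
    using System.Xml.Linq;

    class LinqToXmlExport
    {
        const string ConnectionString = "Server=.;Database=MyDb;Integrated Security=true";

        // Lazily yield rows from the stored procedure so nothing is buffered in memory.
        static IEnumerable<IDataRecord> EnumerateRows()
        {
            using (var connection = new SqlConnection(ConnectionString))
            using (var command = new SqlCommand("dbo.GetRecordsToExport", connection)
                                 { CommandType = CommandType.StoredProcedure, CommandTimeout = 0 })
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                    while (reader.Read())
                        yield return reader;
            }
        }

        static void Main()
        {
            // XStreamingElement defers the query, so rows are read and written one by one
            // while Save() runs, instead of building a 1-million-element tree up front.
            var doc = new XStreamingElement("Records",
                from row in EnumerateRows()
                select new XElement("Record",
                    new XElement("Id", row["Id"]),
                    new XElement("Name", row["Name"])));

            doc.Save("export.xml");
        }
    }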

EDIT: Your SP is time-consuming, so you need to optimize it. An example: in the SP, T_Data with 1M records joins T_User with 1M records, and that costs a lot of time. After optimization: in the SP, T_Data joins a single record from T_User (essentially just a WHERE clause, which is very fast), and in the C# code you fetch the records from T_User; for each record, call the SP, get the data, and build one piece of your XML. All of these can be processed concurrently, as in the sketch below. At the end, you merge all the pieces of XML into one.
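A sketch of that parallel, per-user approach might look like this; T_User, the dbo.GetDataForUser proc, and the column names are all hypothetical stand-ins for the poster's real schema. Note that the per-user pieces are held in memory until the merge, so this trades memory for concurrency.

    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;
    using System.Linq;
    using System.Threading.Tasks;
    using System.Xml.Linq;

    class ParallelExport
    {
        const string ConnectionString = "Server=.;Database=MyDb;Integrated Security=true";

        static List<int> LoadUserIds()
        {
            var ids = new List<int>();
            using (var connection = new SqlConnection(ConnectionString))
            using (var command = new SqlCommand("SELECT UserId FROM T_User", connection))
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                    while (reader.Read())
                        ids.Add(reader.GetInt32(0));
            }
            return ids;
        }

        // Call the (hypothetical) per-user proc and build one XML piece for that user.
        static XElement BuildUserPiece(int userId)
        {
            var piece = new XElement("User", new XAttribute("Id", userId));
            using (var connection = new SqlConnection(ConnectionString))
            using (var command = new SqlCommand("dbo.GetDataForUser", connection)
                                 { CommandType = CommandType.StoredProcedure })
            {
                command.Parameters.AddWithValue("@UserId", userId);
                connection.Open();
                using (var reader = command.ExecuteReader())
                    while (reader.Read())
                        piece.Add(new XElement("Record",
                            new XElement("Id", reader["Id"]),
                            new XElement("Name", reader["Name"])));
            }
            return piece;
        }

        static void Main()
        {
            var pieces = new ConcurrentBag<XElement>();

            // Each user is fetched on its own connection, concurrently.
            Parallel.ForEach(LoadUserIds(), userId => pieces.Add(BuildUserPiece(userId)));

            // Merge the per-user pieces into a single document at the end.
            new XDocument(new XElement("Records",
                    pieces.OrderBy(p => (int)p.Attribute("Id"))))
                .Save("export.xml");
        }
    }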
