
What is the proper way to insert millions of records from a CSV file into a SQL Server database?

I have to insert 40 million records from a .csv file into a database and below is the process I followed.

  • Windows Service 1:

    • Reading the CSV
    • Validating the records
    • Inserting valid records into the success table (intermediate table) using SqlBulkCopy (see the sketch after this list)
  • Windows Service 2:

    • Getting 10,000 records from the success table at a time
    • Running a foreach over these 10,000 records
    • Sending each record to the database to insert into the main table
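For context, a minimal sketch of the SqlBulkCopy step in Windows Service 1 might look like the following. The connection string, the table name `SuccessTable`, and the batch size are assumptions for illustration, not the actual code:

```csharp
using System.Data;
using Microsoft.Data.SqlClient; // or System.Data.SqlClient on older frameworks

static void BulkInsertValidRecords(DataTable validRecords, string connectionString)
{
    // SuccessTable is a placeholder name for the intermediate (staging) table.
    using (var bulkCopy = new SqlBulkCopy(connectionString))
    {
        bulkCopy.DestinationTableName = "dbo.SuccessTable";
        bulkCopy.BatchSize = 10000;           // send rows to the server in batches
        bulkCopy.BulkCopyTimeout = 0;         // no timeout for a large load
        bulkCopy.WriteToServer(validRecords); // streams the rows via the bulk-load API
    }
}
```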

Windows Service 1 takes about 30-40 minutes, but Windows Service 2 takes at least 5 hours to complete its task. I have a few options but cannot decide which is better, and I am open to suggestions:

  1. Creating 4 separate Windows Services and processing 40,000 records simultaneously
  2. Using a job in which we can use a while loop
  3. Calling the procedure asynchronously from the Windows Service

My biggest doubt here is that we are using transactions in the procedure, and I am not sure whether async will work with them, because my assumption is that a transaction locks the table while other processes still need to work on it.
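For option 3, a hedged sketch of calling the procedure asynchronously is below. The procedure name `usp_InsertMainTable` and its parameter are hypothetical; note that whether the transaction inside the procedure blocks other sessions depends on what it locks and for how long, not on whether the call is made asynchronously.

```csharp
using System.Data;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

// usp_InsertMainTable and @RecordId are placeholder names for illustration.
static async Task InsertRecordAsync(int recordId, string connectionString)
{
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("dbo.usp_InsertMainTable", connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.AddWithValue("@RecordId", recordId);

        await connection.OpenAsync();
        // The async call frees the calling thread while SQL Server does the work;
        // any transaction inside the procedure still holds its locks until it commits.
        await command.ExecuteNonQueryAsync();
    }
}
```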

I think you're using the wrong tool for this job. C# apps might do the trick, but there is a much more powerful way to do this using SQL Server Integration Services (SSIS).

I am guessing here, but are these intermediate tables there to transform or validate the data, or maybe to batch the import down?

SSIS can do all of these things using its logging limits and SQL bulk import tools. I currently do hospital data imports of around 8,000,000 records each night, and it takes a matter of minutes, not hours.
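Independent of SSIS, the row-by-row foreach in Windows Service 2 is usually the bottleneck; moving rows from the staging table to the main table with a single set-based statement is typically far faster than issuing one insert per record. A sketch of that idea, assuming hypothetical table names `SuccessTable` and `MainTable` and a matching column list:

```csharp
using Microsoft.Data.SqlClient;

static int MoveStagedRecords(string connectionString)
{
    // One INSERT ... SELECT replaces millions of individual inserts.
    // Table and column names are placeholders for the real schema.
    const string sql = @"
        INSERT INTO dbo.MainTable (Col1, Col2, Col3)
        SELECT Col1, Col2, Col3
        FROM dbo.SuccessTable;";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        command.CommandTimeout = 0; // large loads can exceed the default 30-second timeout
        connection.Open();
        return command.ExecuteNonQuery(); // returns the number of rows inserted
    }
}
```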

A good read on how SQL Server deals with such large data inputs is this article.
