
What is the easiest (and fastest) way to update 20K employee data records from a CSV data dump in SQL Server?

I have an "employees" table with 50k+ records. We only have 24k employees, but some of the employees who are no longer here are tied to historical projects, so I don't want to delete them. And, of course, we've hired more employees who are working on NEW projects, so they need to be added to the employees table.

I managed to convince HR to give me a CSV file with the employee data we keep in our table, and now I need a way to update the existing records (new phone numbers, departments, etc.) and add the new ones.

There are 3 criteria:

  1. if the record exists in the CSV and the "employees" table, UPDATE the data;
  2. if the record exists in the CSV and NOT the "employees" table, INSERT the data;
  3. if the record exists in the "employees" table and NOT the CSV, set the record to "inactive."

This will be a regular (monthly) process, so a stored procedure or a function would be an acceptable solution.

Suggestions please...

UPDATE: The MERGE idea works but only solves two of the three cases. It does not meet criterion #3, because I do not want to delete the record when an employee is no longer with the company. When I add a second UPDATE statement after NOT MATCHED BY SOURCE, it returns an error indicating I cannot update the same record twice.

Any suggestions to this final piece of the puzzle?

What about using MERGE?

MERGE target_table USING source_table
ON merge_condition
WHEN MATCHED
    THEN update_statement
WHEN NOT MATCHED
    THEN insert_statement
WHEN NOT MATCHED BY SOURCE
    THEN DELETE;
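
For criterion #3, the WHEN NOT MATCHED BY SOURCE branch does not have to DELETE; T-SQL also allows an UPDATE there, so the row can be flagged inactive instead. Below is a minimal sketch of the whole monthly load, not a tested implementation. Everything named in it is an assumption for illustration: the procedure name dbo.usp_sync_employees, the columns employee_id, full_name, phone, department and is_active, and the file path C:\import\employees.csv. It also assumes employee_id is not an IDENTITY column and that the CSV has a header row and no quoted fields containing commas.

```sql
-- Hypothetical procedure; CREATE OR ALTER needs SQL Server 2016 SP1 or later.
CREATE OR ALTER PROCEDURE dbo.usp_sync_employees
AS
BEGIN
    SET NOCOUNT ON;

    -- Staging table for the CSV dump (column names and types are assumptions).
    -- The primary key rejects duplicate employee_id values at load time.
    CREATE TABLE #employees_csv (
        employee_id INT           NOT NULL PRIMARY KEY,
        full_name   NVARCHAR(200) NOT NULL,
        phone       NVARCHAR(50)  NULL,
        department  NVARCHAR(100) NULL
    );

    -- Hypothetical path; it must be readable by the SQL Server service account.
    -- Adjust the terminators to match the file HR actually produces.
    BULK INSERT #employees_csv
    FROM 'C:\import\employees.csv'
    WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

    MERGE dbo.employees AS tgt
    USING #employees_csv AS src
        ON tgt.employee_id = src.employee_id
    -- Criterion 1: in both the CSV and the table, so refresh the data.
    WHEN MATCHED THEN
        UPDATE SET tgt.full_name  = src.full_name,
                   tgt.phone      = src.phone,
                   tgt.department = src.department
    -- Criterion 2: in the CSV only, so add the new employee as active.
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (employee_id, full_name, phone, department, is_active)
        VALUES (src.employee_id, src.full_name, src.phone, src.department, 1)
    -- Criterion 3: in the table only, so flag as inactive instead of deleting.
    WHEN NOT MATCHED BY SOURCE AND tgt.is_active = 1 THEN
        UPDATE SET tgt.is_active = 0;
END;
```

Each target row falls into exactly one branch, so no second UPDATE statement is needed after the MERGE. A common cause of the "same row more than once" error is duplicate keys in the source; if that is what happened here, the primary key on the staging table should surface it during the load rather than inside the MERGE. The monthly run would then be EXEC dbo.usp_sync_employees;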

