BULK INSERT across multiple related tables?

I need to do a BULK INSERT of several hundred-thousand records across 3 tables. A simple breakdown of the tables would be:

TableA
--------
TableAID (PK)
TableBID (FK)
TableCID (FK)
Other Columns

TableB
--------
TableBID (PK)
Other Columns

TableC
--------
TableCID (PK)
Other Columns

The problem with a bulk insert, of course, is that it only works with one table at a time, so the FKs become a problem.

I've been looking around for ways to work around this, and from what I've gleaned from various sources, using a SEQUENCE column might be the best bet. I just want to make sure I have correctly cobbled together the logic from the various threads and posts I've read on this. Let me know if I have the right idea.

First, I would modify the tables to look like this:

TableA
--------
TableAID (PK)
TableBSequence
TableCSequence
Other Columns

TableB
--------
TableBID (PK)
TableBSequence
Other Columns

TableC
--------
TableCID (PK)
TableCSequence
Other Columns

Then, from within the application code, I would make five calls to the database with the following logic:

  • Request X Sequence numbers from TableC, where X is the known number of records to be inserted into TableC. (1st DB call)

  • Request Y Sequence numbers from TableB, where Y is the known number of records to be inserted into TableB. (2nd DB call)

  • Modify the existing objects for A, B and C (which are models generated to mirror the tables) with the now-known Sequence numbers.

  • Bulk insert to TableA. (3rd DB call)

  • Bulk insert to TableB. (4th DB call)

  • Bulk insert to TableC. (5th DB call)

And then, of course, we would always join on the Sequence.
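The "modify the existing objects" step can be sketched in application code as follows. This is only an illustration in Python: the class names, field names and the `assign_sequences` helper are hypothetical, and it assumes each reserved range is contiguous with an increment of 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowB:
    payload: str
    sequence: Optional[int] = None  # TableBSequence, unknown until reserved


@dataclass
class RowC:
    payload: str
    sequence: Optional[int] = None  # TableCSequence, unknown until reserved


@dataclass
class RowA:
    payload: str
    b: RowB  # in-memory reference to the related TableB row
    c: RowC  # in-memory reference to the related TableC row
    b_sequence: Optional[int] = None  # TableBSequence copied from b
    c_sequence: Optional[int] = None  # TableCSequence copied from c


def assign_sequences(rows_a: List[RowA], rows_b: List[RowB],
                     rows_c: List[RowC], first_b: int, first_c: int) -> None:
    """Stamp reserved values onto B and C rows, then copy them onto A rows.

    first_b / first_c are the first values of the ranges reserved in the
    1st and 2nd DB calls (assumed contiguous, increment 1).
    """
    for offset, row in enumerate(rows_b):
        row.sequence = first_b + offset
    for offset, row in enumerate(rows_c):
        row.sequence = first_c + offset
    for row in rows_a:
        row.b_sequence = row.b.sequence
        row.c_sequence = row.c.sequence
```

After this runs, all three lists carry final values, so the three bulk inserts do not depend on each other's results.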

I have three questions:

  1. Do I have the basic logic correct?

  2. In Tables B and C, would I remove the clustered index from the PK and put it on the Sequence instead?

  3. Once the Sequence numbers are requested from Tables B and C, are they then somehow locked between the request and the bulk insert? I just need to make sure that between the request and the insert, some other process doesn't request and use the same numbers.

Thanks!

EDIT:

After typing this up and posting it, I've been reading deeper into the SEQUENCE documentation. I think I misunderstood it at first. SEQUENCE is not a column type. For the actual column in the table, I would just use an INT (or maybe a BIGINT, depending on the number of records I expect to have). The actual SEQUENCE object is an entirely separate entity whose job is to generate numeric values on request and keep track of which ones have already been generated. So, if I understand correctly, I would create two SEQUENCE objects, one to be used in conjunction with Table B and one with Table C.

So that answers my third question.
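To make the "keeps track of which ones have already been generated" idea concrete, here is a small in-memory analogy in Python. It is not how SQL Server implements SEQUENCE; the `RangeAllocator` class is hypothetical and only shows why two callers that each reserve a range can never receive overlapping values.

```python
import threading
from typing import Tuple


class RangeAllocator:
    """In-memory stand-in for a sequence object (analogy only)."""

    def __init__(self, start: int = 1) -> None:
        self._next = start
        self._lock = threading.Lock()

    def get_range(self, size: int) -> Tuple[int, int]:
        """Reserve `size` consecutive values; return (first, last)."""
        if size < 1:
            raise ValueError("size must be at least 1")
        # The counter is advanced under a lock, so a reserved range is
        # never handed out again, whether or not the caller uses it.
        with self._lock:
            first = self._next
            self._next += size
        return first, first + size - 1
```

Once a range is handed out, the counter has already moved past it, so nothing needs to stay locked between the request and the later bulk insert.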

Do I have the basic logic correct?

Yes. The other common approach here is to bulk load your data into a staging table and do something similar on the server side.

From the client you can request ranges of sequence values using the sp_sequence_get_range stored procedure.
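As a rough sketch, the T-SQL batch for one such request could be built on the client like this. The helper name `build_get_range_batch` and the sequence name `dbo.TableBSeq` are hypothetical. The procedure's range outputs are `sql_variant`, so the batch converts them to `bigint` before selecting them. Sending the batch through a database driver is left out here.

```python
import re

# Accept only plain one- or two-part names such as "dbo.TableBSeq",
# because the name is embedded in the batch text.
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_get_range_batch(sequence_name: str, range_size: int) -> str:
    """Return a T-SQL batch that reserves `range_size` values."""
    if not _NAME.match(sequence_name):
        raise ValueError("unexpected sequence name: %r" % sequence_name)
    if type(range_size) is not int or range_size < 1:
        raise ValueError("range_size must be a positive integer")
    return (
        "SET NOCOUNT ON;\n"
        "DECLARE @first sql_variant, @last sql_variant;\n"
        "EXEC sys.sp_sequence_get_range\n"
        f"    @sequence_name = N'{sequence_name}',\n"
        f"    @range_size = {range_size},\n"
        "    @range_first_value = @first OUTPUT,\n"
        "    @range_last_value = @last OUTPUT;\n"
        "SELECT CONVERT(bigint, @first) AS first_value,\n"
        "       CONVERT(bigint, @last) AS last_value;"
    )
```

The single row returned by the final SELECT gives the bounds of the reserved range, which the application can then spread across its in-memory rows.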

In Tables B and C, would I remove the clustered index from the PK

No; as you later noted, the sequence just supplies the PK values for you.

Sorry, I read your question wrong at first. I see now that you are trying to generate your own PKs rather than allowing MS SQL to generate them for you. Scratch my above comment.

As David Browne mentioned, you might want to use a staging table to avoid the strain you'll put on your app's heap. Use tempdb and do the modifications directly on the staging tables, using a single transaction for each table. Then copy the staging tables over to their targets, or use a MERGE if you are appending. If you are enforcing FKs, you can temporarily remove those constraints if you choose to insert in reverse order (C=>B=>A). You may also want to consider temporarily removing indexes if you experience performance issues during the insert. Last, consider using SSIS instead of a custom app.
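The staging-table idea can be tried out end to end with Python's built-in sqlite3 module. This is only a toy stand-in for SQL Server: the `load_via_staging` helper, the `Staging` table and the `BKey`/`CKey`/`AValue` columns are hypothetical, and it assumes each staged row carries a natural key that identifies its B and C parents. The parents are loaded first (C, then B) and A is loaded last, picking up the generated IDs through a join, so the foreign keys stay enforced throughout.

```python
import sqlite3


def load_via_staging(conn, records):
    """records: iterable of (a_value, b_key, c_key) tuples."""
    cur = conn.cursor()
    cur.executescript("""
        PRAGMA foreign_keys = ON;
        CREATE TABLE TableB (TableBID INTEGER PRIMARY KEY, BKey TEXT UNIQUE);
        CREATE TABLE TableC (TableCID INTEGER PRIMARY KEY, CKey TEXT UNIQUE);
        CREATE TABLE TableA (
            TableAID INTEGER PRIMARY KEY,
            TableBID INTEGER NOT NULL REFERENCES TableB(TableBID),
            TableCID INTEGER NOT NULL REFERENCES TableC(TableCID),
            AValue   TEXT
        );
        CREATE TEMP TABLE Staging (AValue TEXT, BKey TEXT, CKey TEXT);
    """)
    # "Bulk load" the flat input into the staging table.
    cur.executemany("INSERT INTO Staging VALUES (?, ?, ?)", records)
    # Parents first; the database generates their IDs.
    cur.execute("INSERT INTO TableC (CKey) SELECT DISTINCT CKey FROM Staging")
    cur.execute("INSERT INTO TableB (BKey) SELECT DISTINCT BKey FROM Staging")
    # Child last; resolve the generated IDs by joining on the natural keys.
    cur.execute("""
        INSERT INTO TableA (TableBID, TableCID, AValue)
        SELECT b.TableBID, c.TableCID, s.AValue
        FROM Staging AS s
        JOIN TableB AS b ON b.BKey = s.BKey
        JOIN TableC AS c ON c.CKey = s.CKey
    """)
    conn.commit()
```

On SQL Server, the first insert would be the actual bulk load into a tempdb table, and the set-based statements would be INSERT ... SELECT or MERGE against the real targets.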

