简体   繁体   中英

Insert rows from TableA that are not in TableB using concatenated keys

I have 2 tables. TableA gets populated by a csv import and typically contains between 10k and 15k rows. TableB has the same structure, and has now grown to about 95k rows. In order to determine rows in TableA that are not in TableB, I need to compare a concatenation of 4 fields in TableA with the same concatenation in TableB.

The code below has been working as TableB has been growing, but is just taking so long that it needs to be cancelled and does not finish.

I strongly believe that the use of concatenated fields as a comparison is causing execution times to grow beyond usability.

Is there a better approach to the problem?

DELETE FROM billing..whse_Temp
BULK INSERT billing..whse_Temp
FROM '/mnt/ABC/ABC.csv'
WITH
(
  FORMAT='csv',
  FIRSTROW=2,
  FIELDTERMINATOR=',',
  ROWTERMINATOR='\r\n'
)

INSERT INTO billing..whse
SELECT * FROM billing..whse_Temp S
WHERE CONCAT(S.RunTimeStamp, S.CS_Datacenter,S.Customer, S.ServerName) NOT IN
  (
    SELECT CONCAT(RunTimeStamp, CS_Datacenter, Customer, ServerName) 
    FROM billing..whse
  )

Simply use NOT EXISTS :

INSERT INTO billing..whse
SELECT * FROM billing..whse_temp S
WHERE NOT EXISTS
(
  SELECT NULL
  FROM billing..whse w
  WHERE w.runtimestamp = s.runtimestamp
    AND w.cs_datacenter = s.cs_datacenter
    AND w.customer = s.customer
    AND w.servername = s.servername
);

The appropriate index for this:

CREATE INDEX idx ON billing..whse (runtimestamp, cs_datacenter, customer, servername);

I'm sure there are ways to do it with a MERGE command, but I've never really used those. I'm sure there's ways with EXISTS across multiple columns, but I personally find it clearer to have the full join condition and then just test for where the join failed. (ie: no row on the right side):

INSERT INTO billing..whse
SELECT S.* 
  FROM billing..whse_Temp S
       LEFT OUTER JOIN billing..whse W
         ON S.RunTimeStamp = W.RunTimeStamp
         AND S.CS_Datacenter = W.CS_Datacenter
         AND S.Customer = W.Customer
         AND S.ServerName = W.ServerName
WHERE W.RunTimeStamp IS NULL

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM