简体   繁体   中英

Merge statement inserting instead of updating in SQL Server

I'm using SQL Server 2008 and I'm trying to load a new (target) table from a staging (source) table. The target table is empty.

I think since my target table is empty, the MERGE statement skips the WHEN MATCHED part ie result of INNER JOIN is NULL and so nothing is UPDATED, and it just proceed to the WHEN NOT MATCHED BY TARGET part (LEFT OUTER JOIN) an inserts all the records in the staging table.

My target table looks exactly similar to my staging table (rows #1 and #4). There should be only 3 rows in the target table (3 inserts and one update for row #4). So, I'm not sure whats going on.

FileID  client_id account_name  account_currency  creation_date last_modified
210     12345           Cars            USD       2013-11-21    2013-11-27 
211     23498           Truck           USD       2013-09-22    2013-11-27 
212     97652           Cars - 1        USD       2013-09-17    2013-11-27 
210     12345           Cars            JPY       2013-11-21    2013-11-29


MERGE [AccountSettings] AS tgt -- RIGHT TABLE
USING
(
SELECT * FROM [AccountSettings_Staging]
) AS src -- LEFT TABLE
ON src.client_id = tgt.client_id
AND src.account_name = tgt.account_name
WHEN MATCHED -- INNER JOIN
    THEN UPDATE
       SET
         tgt.[FileID] = src.[FileID]
        ,tgt.[account_currency] = src.[account_currency]
        ,tgt.[creation_date] = src.[creation_date]
        ,tgt.[last_modified] = src.[last_modified]

WHEN NOT MATCHED BY TARGET  -- left outer join: A row from the source that has no corresponding row in the target
THEN INSERT
    (
        [FileID],   
        [client_id], 
        [account_name],
        [account_currency],
        [creation_date], 
        [last_modified] 
    )
    VALUES
    (
        src.[FileID],   
        src.[client_id], 
        src.[account_name],
        src.[account_currency], 
        src.[creation_date], 
        src.[last_modified]             
    );

Since the target table is empty, using MERGE seems to me like hiring a plumber to pour you a glass of water. And MERGE operates only one branch, independently, for every row of a table - it can't see that the key is repeated and so perform an insert and then an update - this betrays that you think SQL always operates on a row-by-row basis, when in fact most operations are performed on the entire set at once.

Why not just insert only the most recent row:

;WITH cte AS 
(
  SELECT FileID, ... other columns ..., 
    rn = ROW_NUMBER() OVER (PARTITION BY FileID ORDER BY last_modified DESC)
  FROM dbo.AccountSettings_Staging
)
INSERT dbo.AccountSettings(FileID, ... other columns ...)
  SELECT FileID, ... other columns ...
  FROM cte WHERE rn = 1;

If you have potential for ties on the most recent last_modified , you'll need to find another tie-breaker (not obvious from your sample data).

For future versions, I would say run an UPDATE first:

UPDATE a SET client_id = s.client_id /* , other columns that can change */
  FROM dbo.AccountSettings AS a
  INNER JOIN dbo.AccountSettings_Staging AS s
  ON a.FileID = s.FileID;

(Of course, this will choose an arbitrary row if the source contains multiple rows with the same FileID - you may want to use a CTE here too to make the choice predictable.)

Then add this clause to the INSERT CTE above:

FROM dbo.AccountSettings_Staging AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.AccountSettings 
  WHERE FileID = s.FileID);

Wrap it all in a transaction at the appropriate isolation level, and you are still avoiding a ton of complicated MERGE syntax, potential bugs, etc.

I think since my target table is empty, the MERGE statement skips the WHEN MATCHED part

Well, that's correct, but it's by design - MERGE is not a "progressive" merge. It does not go row-by-row to see if records inserted as part of the MERGE should now be updated. It processes the source in "batches" based on whether or not a match was found in the destination.

You'll need to deal with the "duplicate" records at the source before attempting the MERGE .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM