Merge statement inserting instead of updating in SQL Server

I'm using SQL Server 2008 and I'm trying to load a new (target) table from a staging (source) table. The target table is empty.

I think since my target table is empty, the MERGE statement skips the WHEN MATCHED part, i.e. the result of the INNER JOIN is NULL and so nothing is UPDATED, and it just proceeds to the WHEN NOT MATCHED BY TARGET part (LEFT OUTER JOIN) and inserts all the records in the staging table.

After the MERGE, my target table looks exactly like my staging table, including both rows #1 and #4. There should be only 3 rows in the target table (3 inserts and one update, for row #4). So I'm not sure what's going on.

FileID  client_id  account_name  account_currency  creation_date  last_modified
210     12345      Cars          USD               2013-11-21     2013-11-27
211     23498      Truck         USD               2013-09-22     2013-11-27
212     97652      Cars - 1      USD               2013-09-17     2013-11-27
210     12345      Cars          JPY               2013-11-21     2013-11-29


QUERY

MERGE [AccountSettings] AS tgt -- RIGHT TABLE
USING
(
SELECT * FROM [AccountSettings_Staging]
) AS src -- LEFT TABLE
ON src.client_id = tgt.client_id
AND src.account_name = tgt.account_name
WHEN MATCHED -- INNER JOIN
    THEN UPDATE
       SET
         tgt.[FileID] = src.[FileID]
        ,tgt.[account_currency] = src.[account_currency]
        ,tgt.[creation_date] = src.[creation_date]
        ,tgt.[last_modified] = src.[last_modified]

WHEN NOT MATCHED BY TARGET  -- left outer join: A row from the source that has no corresponding row in the target
THEN INSERT
    (
        [FileID],   
        [client_id], 
        [account_name],
        [account_currency],
        [creation_date], 
        [last_modified] 
    )
    VALUES
    (
        src.[FileID],   
        src.[client_id], 
        src.[account_name],
        src.[account_currency], 
        src.[creation_date], 
        src.[last_modified]             
    );

Since the target table is empty, using MERGE seems to me like hiring a plumber to pour you a glass of water. MERGE also applies exactly one branch to each source row, independently of the others. It can't see that a key is repeated and so perform an insert followed by an update. Expecting that suggests you think SQL always operates on a row-by-row basis, when in fact most operations are performed on the entire set at once.
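To see the one-branch-per-row behaviour in isolation, here is a minimal repro sketch. The temp tables #Staging and #Target are hypothetical stand-ins for the real tables, the column types are guesses based on the sample data, and the OUTPUT clause is added only to show which branch each source row took.

```sql
-- Hypothetical stand-ins for AccountSettings_Staging and AccountSettings.
-- Column types are assumptions inferred from the sample data.
CREATE TABLE #Staging
(
    FileID int, client_id int, account_name varchar(50),
    account_currency char(3), creation_date date, last_modified date
);
CREATE TABLE #Target
(
    FileID int, client_id int, account_name varchar(50),
    account_currency char(3), creation_date date, last_modified date
);

INSERT #Staging VALUES
    (210, 12345, 'Cars',     'USD', '2013-11-21', '2013-11-27'),
    (211, 23498, 'Truck',    'USD', '2013-09-22', '2013-11-27'),
    (212, 97652, 'Cars - 1', 'USD', '2013-09-17', '2013-11-27'),
    (210, 12345, 'Cars',     'JPY', '2013-11-21', '2013-11-29');

-- Same shape as the MERGE in the question, run against the empty target.
MERGE #Target AS tgt
USING #Staging AS src
   ON src.client_id = tgt.client_id
  AND src.account_name = tgt.account_name
WHEN MATCHED THEN
    UPDATE SET tgt.FileID = src.FileID,
               tgt.account_currency = src.account_currency,
               tgt.creation_date = src.creation_date,
               tgt.last_modified = src.last_modified
WHEN NOT MATCHED BY TARGET THEN
    INSERT (FileID, client_id, account_name, account_currency, creation_date, last_modified)
    VALUES (src.FileID, src.client_id, src.account_name, src.account_currency,
            src.creation_date, src.last_modified)
-- Report the branch taken for each source row.
OUTPUT $action, inserted.client_id, inserted.account_name, inserted.account_currency;

DROP TABLE #Staging, #Target;
```

Because the match is evaluated against the target as it stood before the statement started, all four rows report INSERT, including both rows for client 12345.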

Why not just insert only the most recent row:

;WITH cte AS 
(
  SELECT FileID, ... other columns ..., 
    rn = ROW_NUMBER() OVER (PARTITION BY FileID ORDER BY last_modified DESC)
  FROM dbo.AccountSettings_Staging
)
INSERT dbo.AccountSettings(FileID, ... other columns ...)
  SELECT FileID, ... other columns ...
  FROM cte WHERE rn = 1;

If you have potential for ties on the most recent last_modified, you'll need to find another tie-breaker (not obvious from your sample data).

For future runs, when the target is no longer empty, I would say run an UPDATE first:

UPDATE a SET client_id = s.client_id /* , other columns that can change */
  FROM dbo.AccountSettings AS a
  INNER JOIN dbo.AccountSettings_Staging AS s
  ON a.FileID = s.FileID;

(Of course, this will choose an arbitrary row if the source contains multiple rows with the same FileID - you may want to use a CTE here too to make the choice predictable.)

Then add this clause to the INSERT CTE above:

FROM dbo.AccountSettings_Staging AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.AccountSettings 
  WHERE FileID = s.FileID);

Wrap it all in a transaction at the appropriate isolation level, and you are still avoiding a ton of complicated MERGE syntax, potential bugs, etc.
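Put together, the UPDATE-then-INSERT approach might look like the sketch below. Several things here are assumptions rather than part of the answer: SERIALIZABLE is only one possible isolation level, the temp table #latest is a hypothetical helper used to avoid repeating the CTE, FileID is treated as the key (as in the snippets above), and the full column list is taken from the question.

```sql
SET XACT_ABORT ON;                             -- any run-time error rolls the whole transaction back
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;  -- assumption: choose the level that fits your workload

BEGIN TRANSACTION;

-- Keep only the newest staging row per FileID so both steps are deterministic.
-- #latest is a hypothetical helper table, not part of the original answer.
WITH ranked AS
(
  SELECT FileID, client_id, account_name, account_currency, creation_date, last_modified,
    rn = ROW_NUMBER() OVER (PARTITION BY FileID ORDER BY last_modified DESC)
  FROM dbo.AccountSettings_Staging
)
SELECT FileID, client_id, account_name, account_currency, creation_date, last_modified
  INTO #latest
  FROM ranked
  WHERE rn = 1;

-- Step 1: update rows that already exist in the target.
UPDATE a
  SET a.client_id = s.client_id,
      a.account_name = s.account_name,
      a.account_currency = s.account_currency,
      a.creation_date = s.creation_date,
      a.last_modified = s.last_modified
  FROM dbo.AccountSettings AS a
  INNER JOIN #latest AS s
  ON a.FileID = s.FileID;

-- Step 2: insert rows that the target does not have yet.
INSERT dbo.AccountSettings
  (FileID, client_id, account_name, account_currency, creation_date, last_modified)
  SELECT s.FileID, s.client_id, s.account_name, s.account_currency,
         s.creation_date, s.last_modified
  FROM #latest AS s
  WHERE NOT EXISTS (SELECT 1 FROM dbo.AccountSettings AS a
    WHERE a.FileID = s.FileID);

COMMIT TRANSACTION;

DROP TABLE #latest;
```

Running the UPDATE before the INSERT means rows inserted in step 2 are not touched again in the same run.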

I think since my target table is empty, the MERGE statement skips the WHEN MATCHED part

Well, that's correct, but it's by design - MERGE is not a "progressive" merge. It does not go row-by-row to see if records inserted as part of the MERGE should now be updated. It processes the source in "batches" based on whether or not a match was found in the destination.

You'll need to deal with the "duplicate" records at the source before attempting the MERGE.
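One way to do that, as a sketch rather than the only option, is to deduplicate inside the USING clause. This assumes that "duplicate" means the same client_id and account_name (the MERGE join keys from the question) and that the row with the latest last_modified should win; ties on last_modified would need an extra ORDER BY column.

```sql
MERGE [AccountSettings] AS tgt
USING
(
  -- Keep one row per join key: the most recently modified one.
  SELECT FileID, client_id, account_name, account_currency, creation_date, last_modified
  FROM
  (
    SELECT *,
      rn = ROW_NUMBER() OVER (PARTITION BY client_id, account_name
                              ORDER BY last_modified DESC)
    FROM [AccountSettings_Staging]
  ) AS ranked
  WHERE rn = 1
) AS src
ON src.client_id = tgt.client_id
AND src.account_name = tgt.account_name
WHEN MATCHED
    THEN UPDATE
       SET
         tgt.[FileID] = src.[FileID]
        ,tgt.[account_currency] = src.[account_currency]
        ,tgt.[creation_date] = src.[creation_date]
        ,tgt.[last_modified] = src.[last_modified]
WHEN NOT MATCHED BY TARGET
    THEN INSERT
    (
        [FileID], [client_id], [account_name],
        [account_currency], [creation_date], [last_modified]
    )
    VALUES
    (
        src.[FileID], src.[client_id], src.[account_name],
        src.[account_currency], src.[creation_date], src.[last_modified]
    );
```

With the sample data and an empty target, this inserts three rows, and the row for client 12345 carries the JPY values from 2013-11-29.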
