Limit number of rows - TSQL - Merge - SQL Server 2008

Hi all, I have the following MERGE script, which works fine for a relatively small number of rows (up to about 20,000, I've found). However, sometimes Table B holds up to 100,000 rows, and I need to merge these into Table A (currently at 60 million rows). This takes quite a while to process, which is understandable, as it has to merge 100,000 rows with 60 million existing records!

I was just wondering if there is a better way to do this. Or is it possible to have some sort of count: merge 20,000 rows from Table B into Table A, delete those merged rows from Table B, then do the next 20,000 rows, and so on, until Table B has no rows left?

Script:

MERGE
    Table_A AS [target]
USING
    Table_B AS [source]
ON
    ([target].recordID = [source].recordID)
WHEN NOT MATCHED BY TARGET
    THEN
        INSERT([recordID],[Field 1],[Field 2],[Field 3],[Field 4],[Field 5])
        VALUES([source].[recordID],[source].[Field 1],[source].[Field 2],[source].[Field 3],[source].[Field 4],[source].[Field 5]);

MERGE is overkill for this, since all you want is to INSERT missing values.

Try:

INSERT INTO Table_A
([recordID],[Field 1],[Field 2],[Field 3],[Field 4],[Field 5])
SELECT  B.[recordID],
        B.[Field 1],B.[Field 2],B.[Field 3],B.[Field 4],B.[Field 5]
FROM Table_B AS B
WHERE NOT EXISTS (SELECT 1 FROM Table_A AS A
                  WHERE A.[recordID] = B.[recordID]);

In my experience, MERGE can perform worse for simple operations like this. I try to reserve it for cases where you need different operations depending on conditions, like an UPSERT.
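For contrast, here is a rough sketch of the kind of UPSERT where MERGE arguably earns its keep: rows that already exist get updated, and missing rows get inserted. The table and column names follow the placeholders used in this thread, and updating only [Field 1] and [Field 2] is an arbitrary choice for illustration.

```sql
-- Sketch only: Table_A, Table_B and the [Field n] columns are the thread's placeholders.
MERGE Table_A AS [target]
USING Table_B AS [source]
    ON [target].[recordID] = [source].[recordID]
WHEN MATCHED THEN
    -- Row already exists in the target: refresh its values from the source.
    UPDATE SET [target].[Field 1] = [source].[Field 1],
               [target].[Field 2] = [source].[Field 2]
WHEN NOT MATCHED BY TARGET THEN
    -- Row is missing from the target: insert it.
    INSERT ([recordID], [Field 1], [Field 2])
    VALUES ([source].[recordID], [source].[Field 1], [source].[Field 2]);
```

When only the NOT MATCHED branch is needed, as in the question, the plain INSERT ... SELECT does the same job with less machinery.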

You can definitely use (SELECT TOP 20000 * FROM B ORDER BY [some_column]) AS [source] in USING and then delete these records after the MERGE. So your pseudo-code will look like:

1. Merge top 20000
2. Delete 20000 records from source table
3. Check @@ROWCOUNT. If it's 0, exit; otherwise goto step 1
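
The three steps above could be sketched in T-SQL roughly as follows. This is an untested outline, not a definitive implementation: it assumes recordID is an INT and is unique in Table_B, and the #Batch temp table and the variable names are made up for illustration. The keys of each batch are captured first so that the MERGE and the DELETE are guaranteed to work on the same rows.

```sql
-- Sketch only: assumes recordID is INT and unique in Table_B.
DECLARE @BatchSize INT = 20000;
DECLARE @Deleted   INT = 1;

-- Hypothetical helper table holding the keys of the current batch.
CREATE TABLE #Batch (recordID INT PRIMARY KEY);

WHILE @Deleted > 0
BEGIN
    TRUNCATE TABLE #Batch;

    -- Pick the next batch of keys from the source table.
    INSERT INTO #Batch (recordID)
    SELECT TOP (@BatchSize) recordID
    FROM Table_B
    ORDER BY recordID;

    BEGIN TRANSACTION;

    -- Step 1: merge only the rows in this batch.
    MERGE Table_A AS [target]
    USING (SELECT B.*
           FROM Table_B AS B
           JOIN #Batch AS X ON X.recordID = B.recordID) AS [source]
        ON [target].[recordID] = [source].[recordID]
    WHEN NOT MATCHED BY TARGET THEN
        INSERT ([recordID],[Field 1],[Field 2],[Field 3],[Field 4],[Field 5])
        VALUES ([source].[recordID],[source].[Field 1],[source].[Field 2],
                [source].[Field 3],[source].[Field 4],[source].[Field 5]);

    -- Step 2: delete exactly those rows from the source table.
    DELETE B
    FROM Table_B AS B
    JOIN #Batch AS X ON X.recordID = B.recordID;

    -- Step 3: remember how many rows were deleted; 0 ends the loop.
    SET @Deleted = @@ROWCOUNT;

    COMMIT TRANSACTION;
END

DROP TABLE #Batch;
```

Wrapping each batch in its own transaction keeps the merge and the delete consistent with each other while avoiding one huge transaction over all 100,000 rows.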

I'm not sure if it runs any faster than merging all the records at the same time. Also, are you sure you need MERGE? From what I see in your code, INSERT INTO ... SELECT should also work for you.
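
If batching is still wanted with INSERT INTO ... SELECT, one possible sketch is below. It assumes recordID is unique in Table_B and that Table_A has an index on recordID; the variable names are made up for illustration. Nothing needs to be deleted from Table B here, because rows that have already been inserted no longer pass the NOT EXISTS filter.

```sql
-- Sketch only: assumes recordID is unique in Table_B and indexed in Table_A.
DECLARE @BatchSize INT = 20000;
DECLARE @Inserted  INT = 1;

WHILE @Inserted > 0
BEGIN
    -- Insert up to @BatchSize rows that are still missing from Table_A.
    INSERT INTO Table_A
        ([recordID],[Field 1],[Field 2],[Field 3],[Field 4],[Field 5])
    SELECT TOP (@BatchSize)
           B.[recordID],
           B.[Field 1],B.[Field 2],B.[Field 3],B.[Field 4],B.[Field 5]
    FROM Table_B AS B
    WHERE NOT EXISTS (SELECT 1 FROM Table_A AS A
                      WHERE A.[recordID] = B.[recordID]);

    -- Stop once a pass inserts nothing.
    SET @Inserted = @@ROWCOUNT;
END
```

Each pass re-checks Table_B against Table_A, so whether this beats a single statement is something to measure rather than assume.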
