简体   繁体   中英

How do I get a records for a table that do not exist in another table?

I Have a staging table which gets updated as part of scheduled batch script.

Lets call this table Staging_Table

Now on a daily basis I update a tabled called Product_Table with entries from Staging_Table. I need to delete the rows in Products_Table which do not have entries from the Staging table.

Now to uncomplicate things the staging table hold around 97000 records while the producst table has only 7000. However ona daily basis the entries in the staging table go up by 97000. i have a key for these products called TDC_IDP_ID....

So i have this query which seems to be taking forever to execute...

DELETE FROM Product_Table
        WHERE PRODUCT_TD_PARTCODE NOT IN ( SELECT TDC_TD_PARTCODE FROM Staging_Table WHERE TDC_IDP_ID = @TDC_IDP_ID )

Now the inner query has 97000 records. How can i optimise this query(to atleast run) or is there another way to go about this? Instead of select i tried the following query and it is still running as i type this question. Its been 11 minutes that it is running....

SELECT COUNT(*)
        FROM   Product_Table
        WHERE PRODUCT_TD_PARTCODE NOT IN ( SELECT TDC_TD_PARTCODE FROM Staging_Table WHERE TDC_IDP_ID = @TDC_IDP_ID )

Use LEFT JOIN instead of NOT IN

Try this:

SELECT COUNT(*)
FROM Product_Table PT 
LEFT OUTER JOIN Staging_Table ST ON PT.PRODUCT_TD_PARTCODE = ST.TDC_TD_PARTCODE AND ST.TDC_IDP_ID = @TDC_IDP_ID 
WHERE ST.TDC_TD_PARTCODE IS NULL 

DELETE PT 
FROM Product_Table PT 
LEFT OUTER JOIN Staging_Table ST ON PT.PRODUCT_TD_PARTCODE = ST.TDC_TD_PARTCODE AND ST.TDC_IDP_ID = @TDC_IDP_ID 
WHERE ST.TDC_TD_PARTCODE IS NULL 

First, rephrase the index as not exists :

DELETE FROM Product_Table
    WHERE NOT EXISTS (SELECT 1
                      FROM Staging_Table st
                      WHERE st.TDC_IDP_ID = @TDC_IDP_ID  AND
                            st.TDC_TD_PARTCODE = product_table.PRODUCT_TD_PARTCODE
                     );

Then you want an index on the staging table:

create index idx_Staging_Table_2 on Staging_Table(TDC_TD_PARTCODE, TDC_IDP_ID);

For these heavy data you must use LEFT JOIN, and the other thing 'IN/ NOT IN' will make you query so heavy to execute and the run time will be more. Use of join will give you faster execution. In your case use left join

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM