Avoiding duplicates during insert

I am working on a stored procedure that builds our fact table every hour. Currently, on each hourly refresh it truncates the table and inserts all the data again. I am trying to change that so it only deletes the rows that are no longer needed and appends new rows. I have the delete part working, but because the ID column (primary key) is created on insert, I am not sure how to avoid inserting duplicate records, which is what I am currently seeing.

Currently, the stored procedure creates the primary key (ID) on insert. I've taken out the TRUNCATE TABLE query and replaced it with a DELETE query. Now I need to avoid duplicates during the insert.

   --INSERT DATA FROM TEMP TABLE TO FACTBP
   INSERT INTO dbo.FactBP
   SELECT 
   [SOURCE]
  ,[DC_ORDER_NUMBER]
  ,[CUSTOMER_PURCHASE_ORDER_ID]
  ,[BILL_TO]
  ,[CUSTOMER_MASTER_RECORD_TYPE]
  ,[SHIP_TO]
  ,[CUSTOMER_NAME]
  ,[SALES_ORDER]
  ,[ORDER_CARRIER]
  ,[CARRIER_SERVICE_ID]
  ,[CREATE_DATE]
  ,[CREATE_TIME]
  ,[ALLOCATION_DATE]
  ,[REQUESTED_SHIP_DATE]
  ,[ADJ_REQ_SHIP]
  ,[CANCEL_DATE]
  ,[DISPATCH_DATE]
  ,[RELEASED_DATE]
  ,[RELEASED_TIME]
  ,[PRIORITY_ORDER]
  ,[SHIPPING_LOAD_NUMBER]
  ,[ORDER_HDR_STATUS]
  ,[ORDER_STATUS]
  ,[DELIVERY_NUMBER]
  ,[DCMS_ORDER_TYPE]
  ,[ORDER_TYPE]
  ,[MATERIAL]
  ,[QUALITY]
  ,[MERCHANDISE_SIZE_1]
  ,[SPECIAL_PROCESS_CODE_1]
  ,[SPECIAL_PROCESS_CODE_2]
  ,[SPECIAL_PROCESS_CODE_3]
  ,[DIVISION]
  ,[DIVISION_DESC]
  ,[ORDER_QTY]
  ,[ORDER_SELECTED_QTY]
  ,[CARTON_PARCEL_ID]
  ,[CARTON_ID]
  ,[SHIP_DATE]
  ,[SHIP_TIME]
  ,[PACKED_DATE]
  ,[PACKED_TIME]
  ,[ADJ_PACKED_DATE]
  ,[FULL_CASE_PULL_STATUS]
  ,[CARRIER_ID]
  ,[TRAILER_ID]
  ,[WAVE_NUMBER]
  ,[DISPATCH_RELEASE_PRIORITY]
  ,[CARTON_TOTE_COUNT]
  ,[PICK_PACK_METHOD]
  ,[RELEASED_QTY]
  ,[SHIP_QTY]
  ,[MERCHANDISE_STYLE]
  ,[PICK_WAREHOUSE]
  ,[PICK_AREA]
  ,[PICK_ZONE]
  ,[PICK_AISLE]
  ,EST_DEL_DATE
  ,null
  --,[ID]
  FROM #TEMP_FACT
  --code for avoiding duplicates

   --DELETE ROWS THAT ARE NO LONGER NEEDED FROM FACTBP
   DELETE FROM dbo.FactBP
   WHERE SHIP_DATE < DATEADD(s,-1,DATEADD(mm, 
   DATEDIFF(m,0,GETDATE())-2,0)) and SHIP_DATE IS NOT NULL

You need to check against the natural key. Since you're talking about a fact table, the natural key is probably a combination of many fields. If we assume SOURCE and DC_ORDER_NUMBER make up the natural key, this should work:

INSERT INTO dbo.FactBP

SELECT 
  t.[SOURCE]
, t.[DC_ORDER_NUMBER]
, t.[CUSTOMER_PURCHASE_ORDER_ID]
, t.[BILL_TO]
, t.[CUSTOMER_MASTER_RECORD_TYPE]
, t.[SHIP_TO]
, t.[CUSTOMER_NAME]
, t.[SALES_ORDER]
, t.[ORDER_CARRIER]
, t.[CARRIER_SERVICE_ID]
, t.[CREATE_DATE]
, t.[CREATE_TIME]
, t.[ALLOCATION_DATE]
, t.[REQUESTED_SHIP_DATE]
, t.[ADJ_REQ_SHIP]
, t.[CANCEL_DATE]
, t.[DISPATCH_DATE]
, t.[RELEASED_DATE]
, t.[RELEASED_TIME]
, t.[PRIORITY_ORDER]
, t.[SHIPPING_LOAD_NUMBER]
, t.[ORDER_HDR_STATUS]
, t.[ORDER_STATUS]
, t.[DELIVERY_NUMBER]
, t.[DCMS_ORDER_TYPE]
, t.[ORDER_TYPE]
, t.[MATERIAL]
, t.[QUALITY]
, t.[MERCHANDISE_SIZE_1]
, t.[SPECIAL_PROCESS_CODE_1]
, t.[SPECIAL_PROCESS_CODE_2]
, t.[SPECIAL_PROCESS_CODE_3]
, t.[DIVISION]
, t.[DIVISION_DESC]
, t.[ORDER_QTY]
, t.[ORDER_SELECTED_QTY]
, t.[CARTON_PARCEL_ID]
, t.[CARTON_ID]
, t.[SHIP_DATE]
, t.[SHIP_TIME]
, t.[PACKED_DATE]
, t.[PACKED_TIME]
, t.[ADJ_PACKED_DATE]
, t.[FULL_CASE_PULL_STATUS]
, t.[CARRIER_ID]
, t.[TRAILER_ID]
, t.[WAVE_NUMBER]
, t.[DISPATCH_RELEASE_PRIORITY]
, t.[CARTON_TOTE_COUNT]
, t.[PICK_PACK_METHOD]
, t.[RELEASED_QTY]
, t.[SHIP_QTY]
, t.[MERCHANDISE_STYLE]
, t.[PICK_WAREHOUSE]
, t.[PICK_AREA]
, t.[PICK_ZONE]
, t.[PICK_AISLE]
, t.EST_DEL_DATE
, null
--,[ID]

FROM #TEMP_FACT t
  left outer join dbo.FactBP f on f.[SOURCE] = t.[SOURCE]
                              and f.[DC_ORDER_NUMBER] = t.[DC_ORDER_NUMBER]

where f.[SOURCE] is null

Adjust the join and the WHERE clause to match the natural key of the table.
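The same anti-join can also be written with NOT EXISTS, which some people find easier to read once the natural key spans several columns. The following is only a minimal sketch: #FactDemo and #StageDemo are hypothetical, cut-down stand-ins for dbo.FactBP and #TEMP_FACT, the ID column is assumed to be an IDENTITY, and SOURCE plus DC_ORDER_NUMBER are still just the assumed natural key.

```sql
-- Hypothetical three-column stand-ins for dbo.FactBP and #TEMP_FACT.
CREATE TABLE #FactDemo (
    ID              int IDENTITY(1,1) PRIMARY KEY,  -- assumption: ID is generated on insert
    [SOURCE]        varchar(10) NOT NULL,
    DC_ORDER_NUMBER varchar(20) NOT NULL,
    ORDER_QTY       int NULL
);
CREATE TABLE #StageDemo (
    [SOURCE]        varchar(10) NOT NULL,
    DC_ORDER_NUMBER varchar(20) NOT NULL,
    ORDER_QTY       int NULL
);

-- One row already loaded; the staging table repeats it and adds a new one.
INSERT INTO #FactDemo ([SOURCE], DC_ORDER_NUMBER, ORDER_QTY)
VALUES ('A', '1001', 5);
INSERT INTO #StageDemo ([SOURCE], DC_ORDER_NUMBER, ORDER_QTY)
VALUES ('A', '1001', 5), ('A', '1002', 7);

-- Insert only the staging rows whose natural key is not in the fact table yet.
INSERT INTO #FactDemo ([SOURCE], DC_ORDER_NUMBER, ORDER_QTY)
SELECT t.[SOURCE], t.DC_ORDER_NUMBER, t.ORDER_QTY
FROM #StageDemo t
WHERE NOT EXISTS (
    SELECT 1
    FROM #FactDemo f
    WHERE f.[SOURCE] = t.[SOURCE]
      AND f.DC_ORDER_NUMBER = t.DC_ORDER_NUMBER
);

-- Returns two rows: order 1001 once, plus the new order 1002.
SELECT [SOURCE], DC_ORDER_NUMBER, ORDER_QTY
FROM #FactDemo
ORDER BY ID;

DROP TABLE #StageDemo;
DROP TABLE #FactDemo;
```

Either form only compares the staging rows against the fact table. If #TEMP_FACT itself can contain the same natural key twice, or if any key column can be NULL, the equality comparison will not catch those cases and they need separate handling.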

You should also take another look at your DELETE script. Do you really want to delete all records with a SHIP_DATE < 2019-07-31 23:59:59.000? Or should that be <=? Maybe this will work better (and is simpler):

DELETE FROM dbo.FactBP
WHERE SHIP_DATE < cast(dateadd(day, 1, EOMONTH(getdate(), -3)) as datetime2)
  and SHIP_DATE IS NOT NULL
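To see the difference between the two cutoffs, both expressions can be evaluated for a fixed point in time. This is a sketch only: @now is a hypothetical run time standing in for GETDATE(), chosen so the result matches the date quoted above, and EOMONTH requires SQL Server 2012 or later.

```sql
DECLARE @now datetime = '2019-10-15T10:00:00';  -- hypothetical run time

SELECT
    -- Expression from the original DELETE: first day of the month two months back, minus one second.
    DATEADD(s, -1, DATEADD(mm, DATEDIFF(m, 0, @now) - 2, 0)) AS original_cutoff,
    -- Suggested expression: the day after the end of the month three months back.
    CAST(DATEADD(day, 1, EOMONTH(@now, -3)) AS datetime2)    AS suggested_cutoff;
-- original_cutoff:  2019-07-31 23:59:59.000
-- suggested_cutoff: 2019-08-01 00:00:00.0000000
```

With the strict < comparison, the original cutoff keeps any row stamped in the last second of July, while the suggested cutoff removes everything before August 1.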
