
How to drop duplicates in a source dataset (JSON) and load the data into Azure SQL DB in Azure Data Factory

I have a table in Azure SQL DB with primary key fields. I am using a copy activity in Azure Data Factory with a source dataset (JSON).

We are writing this data into the sink dataset (SQL DB), but the pipeline is failing with the below error:

"message": "'Type=System.Data.SqlClient.SqlException,Message=Violation of 
 PRIMARY KEY constraint 'PK__field__399771B9251AD6D4'. Cannot 
 insert duplicate key in object 'dbo.crop_original_new'. The 
 duplicate key value is (9161, en).\r\nThe statement has been 
 terminated.,Source=.Net SqlClient Data Provider,SqlErrorNumber=2627,Class=14,ErrorCode=-2146232060,State=1,Errors= 
[{Class=14,Number=2627,State=1,Message=Violation of PRIMARY KEY 
constraint 'PK__field__399771B9251AD6D4'. Cannot insert 
duplicate key in object 'Table'. The duplicate key value is 
(9161, en).,},{Class=0,Number=3621,State=0,Message=The statement has 
been terminated.,},],'",

Well, the cleanest solution would be:

  • Create a staging table (e.g. stg_table) in your SQL database; this table should not enforce the primary key, so duplicate rows can land without failing the copy
  • Load the data from the JSON source into stg_table with the copy activity
  • Write a stored procedure that removes the duplicates and loads the cleaned rows into your destination table (see the T-SQL sketch below)
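
A minimal T-SQL sketch of the staging table and the stored procedure, assuming the destination is dbo.crop_original_new with a two-column key (crop_id, lang) inferred from the duplicate key value (9161, en) in the error; the staging table name, column names and the payload column are placeholders for your real schema:

    -- Staging table: same columns as the destination, but no PRIMARY KEY,
    -- so the copy activity can land duplicate rows without failing.
    CREATE TABLE dbo.stg_crop_original
    (
        crop_id  INT           NOT NULL,
        lang     NVARCHAR(10)  NOT NULL,
        payload  NVARCHAR(MAX) NULL
    );
    GO

    CREATE OR ALTER PROCEDURE dbo.usp_load_crop_original
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Keep one row per (crop_id, lang) and insert only keys
        -- that are not already present in the destination table.
        WITH dedup AS
        (
            SELECT crop_id, lang, payload,
                   ROW_NUMBER() OVER (PARTITION BY crop_id, lang
                                      ORDER BY (SELECT NULL)) AS rn
            FROM dbo.stg_crop_original
        )
        INSERT INTO dbo.crop_original_new (crop_id, lang, payload)
        SELECT d.crop_id, d.lang, d.payload
        FROM dedup AS d
        WHERE d.rn = 1
          AND NOT EXISTS (SELECT 1
                          FROM dbo.crop_original_new AS t
                          WHERE t.crop_id = d.crop_id
                            AND t.lang    = d.lang);

        -- Clear the staging table for the next pipeline run.
        TRUNCATE TABLE dbo.stg_crop_original;
    END;
    GO

In the pipeline, the copy activity writes to dbo.stg_crop_original, and a Stored Procedure activity runs dbo.usp_load_crop_original afterwards.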

Or, if you are familiar with Mapping Data Flows in ADF, you can check this article by Mark Kromer.

You can also use the fault tolerance setting provided in the copy activity to skip incompatible rows.

[Screenshot: fault tolerance setting in the copy activity]
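
If the pipeline is authored as JSON rather than through the UI, the same option is exposed through the copy activity's fault-tolerance properties. A minimal sketch, assuming a JSON source, an Azure SQL sink, and a blob linked service named AzureBlobStorageLS (a placeholder) for logging the skipped rows:

    {
        "name": "CopyJsonToSql",
        "type": "Copy",
        "typeProperties": {
            "source": { "type": "JsonSource" },
            "sink": { "type": "AzureSqlSink" },
            "enableSkipIncompatibleRow": true,
            "redirectIncompatibleRowSettings": {
                "linkedServiceName": {
                    "referenceName": "AzureBlobStorageLS",
                    "type": "LinkedServiceReference"
                },
                "path": "errorlogs/copyactivity"
            }
        }
    }

Note that skipped rows are dropped from the load (and only optionally logged to the blob path), so this approach is appropriate only if losing the duplicate rows is acceptable.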
