
XML File Validation Against Azure SQL Server in Azure Data Factory

My XML file (located in an Azure Blob container):

<?xml version="1.0" encoding="utf-8"?>
<Details xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Person>
    <id>2</id>
    <name>XXX</name>
    <age>12</age>
  </Person>
</Details>

My Azure SQL Server Table

Table Name : UserTABLE

ID | NAME | AGE | GENDER
1  | JAY  | 12  | MALE
2  | XXX  | 11  | MALE

I want to compare the id from the XML file against the Azure SQL Server table above (UserTABLE). If it matches, I want to update the other fields in UserTABLE from the XML file; if not, I want to insert a new row into UserTABLE with all the field values given in the XML.

Can anyone suggest how I should proceed?

You can use a staging table: truncate it, load the XML data into it, and then call a stored procedure from the same ADF pipeline to insert or update rows in the main table based on the staging data. On the next run, the pipeline truncates the staging table and repeats the process. All of this can be done in your existing ADF.

Azure SQL Database has recently gained the ability to load files from Azure Blob Storage using either BULK INSERT or OPENROWSET. Start here.
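OPENROWSET can only read from Blob Storage once an external data source exists in the database. The following is a minimal sketch of that one-time setup, not a definitive configuration: the credential name MyAzureBlobStorageCredential is hypothetical, and the password, SAS token, storage account and container are placeholders you would need to supply. It assumes access via a shared access signature.

```sql
-- One-time setup; skip the master key if the database already has one.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Hypothetical credential name; the secret is the SAS token without the leading '?'.
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<SAS token>';

-- File paths passed to BULK are resolved relative to LOCATION.
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorageAccount
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://<storage account>.blob.core.windows.net/<container>',
    CREDENTIAL = MyAzureBlobStorageCredential
);
```

If LOCATION points at the storage account root rather than a container, the first segment of the BULK path is treated as the container name.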

Taking your example code and data, you could import the xml, shred it, then use MERGE to INSERT/UPDATE it into your main table. A simple demo:

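-- Clean up objects from a previous run ( IF EXISTS avoids errors on the first run )
DROP TABLE IF EXISTS staging;
DROP TABLE IF EXISTS #tmp;
DROP TABLE IF EXISTS yourTable;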

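-- staging holds the raw XML document; #tmp holds the shredded rows
CREATE TABLE staging ( rowId INT IDENTITY PRIMARY KEY, yourXML XML );
CREATE TABLE #tmp ( id INT PRIMARY KEY, name VARCHAR(30) NOT NULL, age INT NOT NULL );

-- Create an empty target table with the same columns as #tmp ( demo only )
SELECT * INTO yourTable FROM #tmp;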

-- Load the whole XML file from Blob Storage as a single value
INSERT INTO staging ( yourXML )
SELECT x.BulkColumn
FROM OPENROWSET(
    BULK 'archive/temp.xml',
    DATA_SOURCE = 'MyAzureBlobStorageAccount',
    SINGLE_BLOB
) AS x;


-- Shred the XML into relational rows, one per Person element
INSERT INTO #tmp ( id, name, age )
SELECT
    x.c.value('(id/text())[1]', 'int' ) AS id,
    x.c.value('(name/text())[1]', 'varchar(30)' ) AS name,
    x.c.value('(age/text())[1]', 'int' ) AS age
FROM staging s
    CROSS APPLY s.yourXML.nodes('Details/Person') AS x(c);


-- Merge the shredded rows into the target table
MERGE INTO dbo.yourTable t
USING
(
SELECT * FROM #tmp
) s ON t.id = s.id

-- Insert new records ( no match on primary key )
WHEN NOT MATCHED BY TARGET
    THEN
        INSERT ( id, name, age )
        VALUES ( s.id, s.name, s.age )

-- Update existing records ( match on primary key ) only when a value has changed
WHEN MATCHED
        AND ( t.name != s.name
         OR t.age != s.age )
    THEN UPDATE
        SET
            t.name = s.name,
            t.age = s.age;


SELECT *
FROM dbo.yourTable

The best thing for you to do is forget about Data Factory for a moment. Write a SQL script that does the setup above and runs successfully. When you've got that working, convert it to a stored procedure and test it. Once that works, you can start to think about Data Factory. You will need an output dataset, but not an input one. Work through the tutorial here.
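As a rough sketch of the "convert it to a stored procedure" step, the shred and MERGE from the demo could be wrapped as shown here. The procedure name dbo.usp_MergePersonFromStaging is hypothetical, and the sketch assumes the staging and yourTable tables from the demo already exist in the dbo schema and that staging has been loaded before the procedure is called.

```sql
-- Hypothetical procedure name; assumes dbo.staging and dbo.yourTable from the demo.
CREATE OR ALTER PROCEDURE dbo.usp_MergePersonFromStaging
AS
BEGIN
    SET NOCOUNT ON;

    MERGE INTO dbo.yourTable AS t
    USING
    (
        -- Shred the staged XML into one row per Person element
        SELECT
            x.c.value('(id/text())[1]', 'int' ) AS id,
            x.c.value('(name/text())[1]', 'varchar(30)' ) AS name,
            x.c.value('(age/text())[1]', 'int' ) AS age
        FROM dbo.staging AS stg
            CROSS APPLY stg.yourXML.nodes('Details/Person') AS x(c)
    ) AS src ON t.id = src.id

    -- Insert rows whose id is not yet in the target
    WHEN NOT MATCHED BY TARGET
        THEN
            INSERT ( id, name, age )
            VALUES ( src.id, src.name, src.age )

    -- Update rows whose id matches but whose values differ
    WHEN MATCHED
            AND ( t.name != src.name
             OR t.age != src.age )
        THEN UPDATE
            SET
                t.name = src.name,
                t.age = src.age;
END;
```

You can test it from a query window with EXEC dbo.usp_MergePersonFromStaging; before calling it from a Data Factory pipeline.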
