XML File Validation Against Azure SQL Server in Azure Data Factory

My XML file (located in an Azure Blob container):

<?xml version="1.0" encoding="utf-8"?>
<Details xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Person>
    <id>2</id>
    <name>XXX</name>
    <age>12</age>
  </Person>
</Details>

My Azure SQL Server table:

Table name: UserTABLE

ID   | NAME | AGE | GENDER    
1    | JAY  | 12  |  MALE
2    | XXX  | 11  |  MALE

I want to compare the data from the XML file against the Azure SQL Server table above (UserTABLE). If a record matches, I want to update the other fields in UserTABLE from the XML file; if it does not, I want to insert a new row into UserTABLE with all the field values given in the XML.

Can anyone suggest how I should proceed?

You can use a staging table: truncate it, load the XML data into it, and then call a stored procedure from the same ADF pipeline to do the insert/update into the target table based on the staged data. On the next run, the pipeline truncates the staging table and repeats the same process. All of this can be done through your existing ADF.
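The stored procedure in that approach could look roughly like the sketch below. It assumes the ADF copy step has already loaded the XML values into a staging table; the names dbo.UserTABLE_Staging and dbo.usp_UpsertUserTable are hypothetical and not from the question. It also assumes that ID is not an identity column and that GENDER is nullable, since the XML carries no gender value.

```sql
-- Hypothetical procedure name; dbo.UserTABLE_Staging is an assumed staging table
-- with columns ( ID, NAME, AGE ) that the ADF copy step truncates and reloads.
CREATE OR ALTER PROCEDURE dbo.usp_UpsertUserTable
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- roll back the whole transaction on any error

    BEGIN TRANSACTION;

    -- Rows whose ID already exists: overwrite the other fields from staging
    UPDATE u
    SET u.NAME = s.NAME,
        u.AGE  = s.AGE
    FROM dbo.UserTABLE AS u
        INNER JOIN dbo.UserTABLE_Staging AS s ON s.ID = u.ID;

    -- Rows whose ID is not in the target yet: insert them
    -- (GENDER is left NULL because the XML does not supply it)
    INSERT INTO dbo.UserTABLE ( ID, NAME, AGE )
    SELECT s.ID, s.NAME, s.AGE
    FROM dbo.UserTABLE_Staging AS s
    WHERE NOT EXISTS ( SELECT 1 FROM dbo.UserTABLE AS u WHERE u.ID = s.ID );

    COMMIT TRANSACTION;
END;
```

With the sample data from the question, the row with ID 2 already exists, so its AGE would be updated from 11 to 12 and no new row would be inserted.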

Azure SQL Database has recently gained the ability to load files from Azure Blob Storage using either BULK INSERT or OPENROWSET. Start here.
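Both BULK INSERT and OPENROWSET reach blob storage through an external data source defined in the database. A minimal sketch of that one-time setup follows, assuming access via a shared access signature; the credential name MyAzureBlobStorageCredential is hypothetical, and every value in angle brackets is a placeholder you must supply yourself.

```sql
-- A database master key must exist before a scoped credential can be created
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Hypothetical credential name; the secret is the SAS token without the leading '?'
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<SAS token>';

-- The data source name is what DATA_SOURCE = '...' refers to in OPENROWSET / BULK INSERT
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorageAccount
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://<storage account>.blob.core.windows.net/<container>',
    CREDENTIAL = MyAzureBlobStorageCredential
);
```

The file path given to BULK is then resolved relative to LOCATION.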

Taking your example code and data, you could import the XML, shred it, then use MERGE to INSERT/UPDATE it into your main table. A simple demo:

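-- Clean up objects from any previous run ( IF EXISTS avoids errors on the first run )
DROP TABLE IF EXISTS staging;
DROP TABLE IF EXISTS #tmp;
DROP TABLE IF EXISTS yourTable;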

CREATE TABLE staging ( rowId INT IDENTITY PRIMARY KEY, yourXML XML );
CREATE TABLE #tmp ( id INT PRIMARY KEY, name VARCHAR(30) NOT NULL, age INT NOT NULL );

-- Create an empty target table with the same columns as #tmp
SELECT * INTO yourTable FROM #tmp;

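-- Load the whole XML file from blob storage as a single value
INSERT INTO staging ( yourXML )
SELECT CAST( x.BulkColumn AS XML )
FROM OPENROWSET(
    BULK 'archive/temp.xml',
    DATA_SOURCE = 'MyAzureBlobStorageAccount',
    SINGLE_BLOB
) AS x;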


-- Shred the XML: one row per Details/Person element
INSERT INTO #tmp ( id, name, age )
SELECT
    x.c.value('(id/text())[1]', 'int' ) AS id,
    x.c.value('(name/text())[1]', 'varchar(30)' ) AS name,
    x.c.value('(age/text())[1]', 'int' ) AS age
FROM staging s
    CROSS APPLY s.yourXML.nodes('Details/Person') AS x(c);


-- Merge the shredded rows into the main table
MERGE INTO dbo.yourTable t
USING
(
SELECT * FROM #tmp
) s ON t.id = s.id

-- Insert new records ( no match on primary key )
WHEN NOT MATCHED BY TARGET
    THEN
        INSERT ( id, name, age )
        VALUES ( s.id, s.name, s.age )

-- Update existing records ( match on primary key ), only when a value has changed
WHEN MATCHED
        AND ( t.name != s.name
         OR t.age != s.age )
    THEN UPDATE
        SET
            t.name = s.name,
            t.age = s.age;


SELECT *
FROM dbo.yourTable

The best thing for you to do is forget about Data Factory for a moment. Write a SQL script that does the setup for the above and runs successfully. When you've got that working, convert it to a stored proc. Test it. When you've got that working, you can start to think about Data Factory. You will need an output dataset, but not an input one. Work through the tutorial here.
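As a rough illustration of the "convert it to a stored proc" step, the demo script could be wrapped as shown below. This is only a sketch: the procedure name dbo.usp_LoadPersonXml is hypothetical, the file path and data source name are carried over unchanged from the demo, and the staging and yourTable tables are assumed to exist already in the dbo schema.

```sql
-- Hypothetical procedure name; wraps the load, shred and merge steps of the demo
CREATE OR ALTER PROCEDURE dbo.usp_LoadPersonXml
AS
BEGIN
    SET NOCOUNT ON;

    -- Start each run from an empty staging table
    TRUNCATE TABLE dbo.staging;

    -- Load the XML file from blob storage as a single value
    INSERT INTO dbo.staging ( yourXML )
    SELECT CAST( x.BulkColumn AS XML )
    FROM OPENROWSET(
        BULK 'archive/temp.xml',
        DATA_SOURCE = 'MyAzureBlobStorageAccount',
        SINGLE_BLOB
    ) AS x;

    -- Shred the XML inline and merge it into the main table
    MERGE INTO dbo.yourTable AS t
    USING
    (
        SELECT
            p.c.value('(id/text())[1]', 'int' ) AS id,
            p.c.value('(name/text())[1]', 'varchar(30)' ) AS name,
            p.c.value('(age/text())[1]', 'int' ) AS age
        FROM dbo.staging AS st
            CROSS APPLY st.yourXML.nodes('Details/Person') AS p(c)
    ) AS src ON t.id = src.id
    WHEN NOT MATCHED BY TARGET
        THEN INSERT ( id, name, age )
             VALUES ( src.id, src.name, src.age )
    WHEN MATCHED AND ( t.name != src.name OR t.age != src.age )
        THEN UPDATE SET t.name = src.name, t.age = src.age;
END;
```

To test it, run the CREATE statement as its own batch, then execute `EXEC dbo.usp_LoadPersonXml;` followed by `SELECT * FROM dbo.yourTable;` and check that the rows match the XML file.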
