SSIS / SQL Server XML

I'm working on creating an XML output from SQL Server 2008 R2. Below is the structure of the XML I want. (I will try to be as clear as possible, but if you need more information, please let me know.)

The issue I'm having is with the node "RecordId". This field has to be a running sequence throughout the XML, irrespective of which node it is under; i.e., each occurrence of "RecordId" under any category must be 1 greater than the immediately preceding one, even if it is under a different category.

The main constraint I have is that I must strictly use T-SQL and SSIS only (a script task with VB or C# is allowed).

<Root>
  <includedFileHeader>
    <GenDatetime>2017-01-13T11:53:36</GenDatetime>
    <OFtype>PD</OFtype>
    <issuerID>ABCDE</issuerID>
    <RecordId>1</RecordId>
  </includedFileHeader>
  <includedIssuerResult>
    <issuerId>ABCDE</issuerId>
    <RecordId>2</RecordId>
    <includedPlanResult>
      <planId>2</planId>
      <insPlanId>123456789</insPlanId>
      <RecordId>3</RecordId>
      <ClassStatusType>
        <Code>A</Code>
      </ClassStatusType>
      <includedDetails>
        <DetailId>48</DetailId>
        <DetailClmId>A3456H567</DetailClmId>
        <RecordId>4</RecordId>
      </includedDetails>
      <includedDetails>
        <DetailId>74</DetailId>
        <DetailClmId>163364170257204</DetailClmId>
        <RecordId>5</RecordId>
      </includedDetails>
    </includedPlanResult>
    <includedPlanResult>
      <planId>3</planId>
      <insPlanId>343546337</insPlanId>
      <RecordId>6</RecordId>
      <ClassStatusType>
        <Code>A</Code>
      </ClassStatusType>
      <includedDetails>
        <DetailId>55</DetailId>
        <DetailClmId>A78947J780</DetailClmId>
        <RecordId>7</RecordId>
      </includedDetails>
      <includedDetails>
        <DetailId>44</DetailId>
        <DetailClmId>146545165A54</DetailClmId>
        <RecordId>8</RecordId>
      </includedDetails>
    </includedPlanResult>
  </includedIssuerResult>
</Root>

I couldn't get anywhere close in T-SQL. I also tried some VB/C# code in a script task, using XmlReader and StreamReader/StreamWriter, but nothing worked. Any help with this is much appreciated.

In general I'd avoid dealing with XML at string level. In this case, though, this might be a workaround...

DECLARE @xml XML=
N'<Root>
  <includedFileHeader>
    <GenDatetime>2017-01-13T11:53:36</GenDatetime>
    <OFtype>PD</OFtype>
    <issuerID>ABCDE</issuerID>
    <RecordId>99</RecordId>
  </includedFileHeader>
  <includedIssuerResult>
    <issuerId>ABCDE</issuerId>
    <RecordId>33</RecordId>
    <includedPlanResult>
      <planId>2</planId>
      <insPlanId>123456789</insPlanId>
      <RecordId>22</RecordId>
      <ClassStatusType>
        <Code>A</Code>
      </ClassStatusType>
      <includedDetails>
        <DetailId>48</DetailId>
        <DetailClmId>A3456H567</DetailClmId>
        <RecordId>66</RecordId>
      </includedDetails>
      <includedDetails>
        <DetailId>74</DetailId>
        <DetailClmId>163364170257204</DetailClmId>
        <RecordId>11</RecordId>
      </includedDetails>
    </includedPlanResult>
    <includedPlanResult>
      <planId>3</planId>
      <insPlanId>343546337</insPlanId>
      <RecordId>6</RecordId>
      <ClassStatusType>
        <Code>A</Code>
      </ClassStatusType>
      <includedDetails>
        <DetailId>55</DetailId>
        <DetailClmId>A78947J780</DetailClmId>
        <RecordId>7</RecordId>
      </includedDetails>
      <includedDetails>
        <DetailId>44</DetailId>
        <DetailClmId>146545165A54</DetailClmId>
        <RecordId>8</RecordId>
      </includedDetails>
    </includedPlanResult>
  </includedIssuerResult>
</Root>';

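--The query

WITH Splitted AS
(
    --Escape the XML text, then turn every escaped <RecordId> opening tag into an <x> element boundary
    SELECT CAST('<x>' + REPLACE((SELECT CAST(@xml AS NVARCHAR(MAX)) FOR XML PATH('')),'&lt;RecordId&gt;','</x><x>') + '</x>' AS XML) AS Part
)
,Parted AS
(
    --One row per fragment; nr=0 is the text before the first <RecordId>
    --(ORDER BY (SELECT NULL) relies on .nodes() returning rows in document order)
    SELECT p.value(N'(./text())[1]',N'nvarchar(max)') AS line
         ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1 AS nr 
    FROM Splitted
    CROSS APPLY Part.nodes(N'/x') AS A(p)
)
SELECT CAST(
        (SELECT line FROM Parted WHERE nr=0)
     +  (
        --Drop the old value and its closing tag (whatever their length), then prepend the newly numbered element
        SELECT  '<RecordId>' + CAST(nr AS NVARCHAR(10)) + '</RecordId>' + SUBSTRING(line,CHARINDEX('</RecordId>',line)+LEN('</RecordId>'),LEN(line))
        FROM Parted 
        WHERE nr>0
        ORDER BY nr     
        FOR XML PATH(''),TYPE 
        ).value('(./text())[1]',N'nvarchar(max)') AS XML);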

Short explanation

The first CTE, Splitted, uses an XML-based on-the-fly string-split approach. The problem is that characters like < or > would break it. That's why I use SELECT ... FOR XML PATH('') before the actual split: it implicitly escapes all forbidden characters. The next step is to split the string at <RecordId>, which is now &lt;RecordId&gt;.

The second CTE, Parted, uses .nodes() to get the fragments as a derived table and ROW_NUMBER to get a running number.

The final SELECT re-concatenates the fragments, inserting the numbered <RecordId> in the right places.
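The two CTEs can be tried in isolation on a short string, which makes the escape-then-split trick easier to follow. This is only a sketch: the variable @s and the tag names r, k and v are made up for illustration, with <k> playing the role of <RecordId>.

```sql
--Hypothetical sample input; <k> stands in for <RecordId>
DECLARE @s NVARCHAR(MAX) = N'<r><k>9</k><v>a</v><k>7</k><v>b</v></r>';

WITH Splitted AS
(
    --FOR XML PATH('') escapes < and >, so the markup becomes plain text;
    --each escaped <k> is then replaced by a real </x><x> element boundary
    SELECT CAST(N'<x>' + REPLACE((SELECT @s FOR XML PATH('')), N'&lt;k&gt;', N'</x><x>') + N'</x>' AS XML) AS Part
)
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1 AS nr
      ,p.value(N'(./text())[1]', N'nvarchar(max)') AS line
FROM Splitted
CROSS APPLY Part.nodes(N'/x') AS A(p);
```

The result should be one row per fragment: row 0 holds everything before the first <k>, and every later row starts with the old value followed by its closing </k>, which is the part the final SELECT cuts off.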
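For comparison, the same renumbering can probably be done without touching the XML as a string, by looping over the RecordId nodes with XML DML. This is a different technique from the query above, shown as a sketch only; the sample document and the variables @doc, @n and @i are assumptions, and every RecordId must already contain a value, because replace value of needs an existing text node.

```sql
--Hypothetical sample; any existing RecordId values will be overwritten
DECLARE @doc XML = N'<Root>
  <a><RecordId>99</RecordId></a>
  <b><RecordId>33</RecordId><c><RecordId>22</RecordId></c></b>
</Root>';

--Number of RecordId nodes anywhere in the document
DECLARE @n INT = @doc.value('count(//RecordId)', 'int');
DECLARE @i INT = 1;

WHILE @i <= @n
BEGIN
    --(//RecordId) is evaluated in document order, so position @i is the
    --@i-th RecordId regardless of which parent node it sits under
    SET @doc.modify('replace value of ((//RecordId)[position()=sql:variable("@i")]/text())[1] with sql:variable("@i")');
    SET @i = @i + 1;
END

SELECT @doc;
```

Each iteration rescans the document, so this is likely to be slower than the string split on large files; the trade-off is that it never depends on how the XML is serialized.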

Thank you everyone for your suggestions and help here. I managed to come up with a solution (similar to Laughing Vergil's idea). It is not that neat, but it gets me what I want and is not very slow.

This is what I did: I created a scalar UDF that takes 2 inputs: 1. the XML data, 2. the name of the tag where I want the sequence (in this case "RecordID"; it is a parameter so the function can be reused for a different tag in future). Also, I default every <RecordID> element to 0 when I generate the XML, so the function can search for <RecordID>0</RecordID> and replace it.

What the function does:
1. Converts the input XML to VARCHAR(MAX).
2. Finds the first occurrence of <RecordID>0</RecordID> and uses STUFF to replace it with sequence number 1.
3. Keeps looping through the XML until there are no more matches, replacing each occurrence with the current sequence number (@j).

CREATE FUNCTION dbo.UDF_UPDATE_RECORDID_XML (
  --XML data, passed as a string
  @xmld VARCHAR(MAX),
  --Tag/node name for the RecordID
  --e.g. for <RecordID>0</RecordID>, pass 'RecordID'
  @rec_tag VARCHAR(100)
)
RETURNS XML
AS
BEGIN
  --output value
  DECLARE @op_xmld VARCHAR(MAX) = @xmld

  --@i: position of the current match, @j: running sequence number
  DECLARE @i INT
  DECLARE @j INT = 1

  --search pattern; wide enough to hold the tag name twice plus the markup
  DECLARE @pat VARCHAR(210) = '<' + @rec_tag + '>0</' + @rec_tag + '>'

  --Find the first occurrence of the pattern in the XML
  --(CHARINDEX rather than PATINDEX, so a character such as _ in the tag name is matched literally)
  SET @i = CHARINDEX(@pat, @op_xmld)

  --Loop until there is no further match for the pattern
  WHILE (@i > 0)
  BEGIN
        --Replace the matched placeholder with the same element carrying the sequence number
        SET @op_xmld = STUFF(@op_xmld, @i, LEN(@pat), '<' + @rec_tag + '>' + CAST(@j AS VARCHAR(20)) + '</' + @rec_tag + '>')

        --sequence number
        SET @j = @j + 1

        --Search for the next placeholder, starting after the one just replaced
        SET @i = CHARINDEX(@pat, @op_xmld, @i + LEN(@pat))
  END

  RETURN CAST(@op_xmld AS XML)
END
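
A call to the function could look like the sketch below. The sample document is made up for illustration and assumes, as described above, that every <RecordID> was generated with the placeholder value 0.

```sql
--Hypothetical input with every RecordID defaulted to 0
DECLARE @xml XML = N'<Root>
  <includedFileHeader><RecordID>0</RecordID></includedFileHeader>
  <includedIssuerResult>
    <RecordID>0</RecordID>
    <includedPlanResult><RecordID>0</RecordID></includedPlanResult>
  </includedIssuerResult>
</Root>';

--XML does not convert implicitly to VARCHAR, hence the explicit CAST
SELECT dbo.UDF_UPDATE_RECORDID_XML(CAST(@xml AS VARCHAR(MAX)), 'RecordID') AS numbered_xml;
```

The three placeholders should come back numbered 1, 2 and 3 in document order. The tag name is matched as written in the pattern, so it has to agree with the spelling used when the XML was generated (RecordID versus RecordId).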
