简体   繁体   English

T-SQL-XML到SQL Server表

[英]T-SQL - XML to SQL Server Table

Case: A table has a field with some XML code. 案例:一个表有一个带有一些XML代码的字段。

-- Some XML
'<DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[MTS]" DTS:CreationName="FLATFILE" DTS:DTSID="{296732CC-7D91-4E49-ACD4-384E03BC032E}" DTS:ObjectName="MTS">
    <DTS:PropertyExpression DTS:Name="ConnectionString">@Something</DTS:PropertyExpression>
    <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="C:\Folder\\File.csv">
            <DTS:FlatFileColumns>
                <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:MaximumWidth="50" DTS:DataType="129" DTS:TextQualified="True" DTS:ObjectName="MC" DTS:DTSID="{E87E7707-B7F7-4EC6-A2CB-98AD637A3985}" DTS:CreationName="" />
                <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="6" DTS:TextQualified="True" DTS:ObjectName="PP" DTS:DTSID="{C7B97962-3B43-40C5-82B1-F6136906CD84}" DTS:CreationName="" />
            </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
    </DTS:ObjectData>
</DTS:ConnectionManager>'
-- Some more XML

Would like to pull out some information and store it as a tabular format. 想提取一些信息并将其存储为表格格式。

Desired output 所需的输出

CreationName    ObjectName  ConnectionString        MaximumWidth    DataType    FieldName
FLATFILE        MTS         C:\Folder\\File.csv     50              129         MC
FLATFILE        MTS         C:\Folder\\File.csv     NULL            6           PP

Explanation of connecting input with the output 输入与输出连接的说明

CreationName - DTS:CreationName from DTS:ConnectionManager. i.e. FLATFILE
ObjectName - DTS:ObjectName from DTS:ConnectionManager. i.e. MTS
ConnectionString - DTS:ConnectionString from DTS:ObjectData\DTS:ConnectionManager. i.e. "C:\Folder\\File.csv"
MaximumWidth - DTS:MaximumWidth from DTS:FlatFileColumns i.e. 50 -- NOTE: MaximumWidth might not always exist
DataType - DTS:DataType from DTS:FlatFileColumns i.e. 129
FieldName - DTS:ObjectName from DTS:FlatFileColumns i.e. MC

Don't really have much experience with XML in SQL Server. 对SQL Server中的XML确实没有太多的经验。 (I'll be doing some of my own playing around and post it here if I get somewhere meaningful. :) ) (我会做一些自己的游戏,如果我觉得有意义,可以将其发布在这里。:))

UPDATED XML Example 更新的XML示例

    <DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts" DTS:refId="P" DTS:CreationDate="10/01/2015 12:00:00">
  <DTS:ConnectionManagers>
    <DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[FF]" DTS:CreationName="FLATFILE" DTS:DTSID="{123}" DTS:ObjectName="FF">
      <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="Test.csv">
          <DTS:FlatFileColumns>
            <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="11" DTS:TextQualified="True" DTS:ObjectName="TestCN" DTS:DTSID="{012}" DTS:CreationName="" />
          </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
      </DTS:ObjectData>
    </DTS:ConnectionManager>
    <DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[FF2]" DTS:CreationName="FLATFILE" DTS:DTSID="{123}" DTS:ObjectName="FF2">
      <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="Test2.csv">
          <DTS:FlatFileColumns>
            <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="11" DTS:TextQualified="True" DTS:ObjectName="TestCN2" DTS:DTSID="{012}" DTS:CreationName="" />
          </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
      </DTS:ObjectData>
    </DTS:ConnectionManager>
  </DTS:ConnectionManagers>
</DTS:Executable>

You are not declaring your namespace in your root element so I substituted that. 您没有在根元素中声明您的名称空间,因此我将其替换。 This should be self extracting and run in anything I am guessing 2008 and higher, though I wrote it in 2014. Just pop it into SQL Server Management Studio: 这应该是自解压的,并且可以在我猜为2008年及以后的版本中运行,尽管我是在2014年编写的。只需将其弹出到SQL Server Management Studio中:

UPDATED 1:45 PM PST: 太平洋标准时间下午1:45更新:

Thanks to Shnugo for the simplification of the 'With XMLNamespaces'. 感谢Shnugo简化了“ With XMLNamespaces”。

DECLARE @XML XML = '
<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts" DTS:refId="P" DTS:CreationDate="10/01/2015 12:00:00">
  <DTS:ConnectionManagers>
    <DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[FF]" DTS:CreationName="FLATFILE" DTS:DTSID="{123}" DTS:ObjectName="FF">
      <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="Test.csv">
          <DTS:FlatFileColumns>
            <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="11" DTS:TextQualified="True" DTS:ObjectName="TestCN" DTS:DTSID="{012}" DTS:CreationName="" />
          </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
      </DTS:ObjectData>
    </DTS:ConnectionManager>
  </DTS:ConnectionManagers>
</DTS:Executable>'
;

WITH XMLNAMESPACES (N'www.microsoft.com/SqlServer/Dts' as DTS )
SELECT 
    y.vals.query('.') AS NodesAsExtracted
,   x.vals.value('@DTS:CreationName', 'Varchar(255)') AS CreationName
,   x.vals.value('@DTS:ObjectName', 'Varchar(255)') AS ObjectName
,   y.vals.value('@DTS:ConnectionString', 'Varchar(255)') AS ConnectionString
,   x.vals.value('@DTS:ColumnType', 'Varchar(255)') AS ColumnType
,   x.vals.value('@DTS:MaximumWidth', 'Varchar(255)') AS MaximumWidth
FROM @XML.nodes('/DTS:Executable/DTS:ConnectionManagers/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager') AS y(vals)
    CROSS APPLY @XML.nodes('/DTS:Executable/DTS:ConnectionManagers/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn') AS x(vals)


/*
The key piece is you are extracting data with a namespace, which makes things harder when querying.
You need to repeat certain 'nodes' so there is a syntax for that called originally enough 'nodes' that breaks up a 3d object like xml into multiple bits
I do one for the high level and one for the lower and then cross apply them which really is a whole world into itself I won't mention here
It should be represented as a parent 'x' and the values found 'vals'
I showed an example as is first when I query '('.')' which is everything in essence.
My namespace declaration must match on the xml that exists and the declaration.

more on nodes https://msdn.microsoft.com/en-us/library/ms188282.aspx
more on query https://msdn.microsoft.com/en-us/library/ms191474.aspx
more on value https://msdn.microsoft.com/en-us/library/ms178030.aspx
*/

This is an enhancment to the answer of djangojazz. 这是对djangojazz答案的增强。 Don't accept this, it's just a copy (but you may vote up, if you like it ;-) ... 不接受这个,它只是一个副本(但是,如果您喜欢它,您可以投票;-)...

By using WITH XMLNAMESPACES you can avoid the multiple declaration of the namespace: 通过使用WITH XMLNAMESPACES,可以避免名称空间的多重声明:

WITH XMLNAMESPACES (N'http://DTS' as DTS )
SELECT 
    x.vals.query('.') AS NodesAsExtracted
,   x.vals.value('@DTS:CreationName', 'Varchar(255)') AS CreationName
,   x.vals.value('@DTS:ObjectName', 'Varchar(255)') AS ObjectName
,   y.vals.value('@DTS:ConnectionString', 'Varchar(255)') AS ConnectionString
,   x.vals.value('@DTS:ColumnType', 'Varchar(255)') AS ColumnType
,   x.vals.value('@DTS:MaximumWidth', 'Varchar(255)') AS MaximumWidth
from @XML.nodes('/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn') AS x(vals)
    CROSS APPLY @XML.nodes('/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager') AS y(vals) 

Of course, you can also use a general-purpose language to convert XML content to tabular format for database import or delimited file export. 当然,您也可以使用通用语言将XML内容转换为表格格式,以进行数据库导入或定界文件导出。

Great as it is, SQL is a special-purpose language and not as fluid or dynamic as Java, C#, Python, PHP, Perl, VB, and still others which also carry libraries to run XPath, XSLT, and other XML-specific tasks. SQL非常棒,它是一种专用语言,不像Java,C#,Python,PHP,Perl,VB以及其他还带有运行XPath,XSLT和其他XML特定任务的库的语言那样灵活或动态。 Additionally, these languages can connect to any database to retrieve BLOB data. 此外,这些语言可以连接到任何数据库以检索BLOB数据。

For future readers, below are open-source examples using the OP's data needs. 对于未来的读者,以下是使用OP数据需求的开源示例。 You will notice use of the positional brackets with [] in xpaths, allowing more DTS:ConnectionManager elements: 您会注意到xpaths中带有[]的位置括号的使用,允许更多DTS:ConnectionManager元素:

Python (using lxml library) Python (使用lxml库)

import os
import lxml.etree as ET

cd = os.path.dirname(os.path.abspath(__file__))

xmlfile = 'DTSfile.xml'
dom = ET.parse(os.path.join(cd, xmlfile))
root = dom.getroot()

nodexpath = dom.xpath("//DTS:ConnectionManager", namespaces=root.nsmap)
dataline = []

def checkPath(xpathstr):    
    if dom.xpath(xpathstr, namespaces=root.nsmap) == []:
        return ''
    else:
        return dom.xpath(xpathstr, namespaces=root.nsmap)[0]

for i in range(1,len(nodexpath)+1):
    if i % 2 == 0: continue

    dataline = []    
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/@DTS:CreationName'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/@DTS:ObjectName'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/@DTS:ConnectionString'.format(i)))    
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:MaximumWidth'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:DataType'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:ObjectName'.format(i)))

    print(dataline)

['FLATFILE', 'FF', 'Test.csv', '', '11', 'TestCN']
['FLATFILE', 'FF2', 'Test2.csv', '', '11', 'TestCN2']

PHP (using simple_xml object) PHP (使用simple_xml对象)

$cd = dirname(__FILE__);

$xml = simplexml_load_file($cd.'/DTSfile.xml');
$xml->registerXPathNamespace('DTS', 'www.microsoft.com/SqlServer/Dts');

$values = [];
$node = $xml->xpath('//DTS:ConnectionManager');

function checkPath($xml, $xpathstr) {         
     $path = $xml->xpath($xpathstr);
     foreach ($path as $value) {
      if (count($path) > 0) {
           foreach($path as $value) {           
           return $value;         
           }
      }   
      else {
           return '';
      }       
     }         
}
$i = 1;

foreach ($node as $n){         
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/@DTS:CreationName');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/@DTS:ObjectName');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/@DTS:ConnectionString');    
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:MaximumWidth');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:DataType');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:ObjectName');     

     if ($values[1] != "") {
      echo implode(", ", $values)."\n";
     }
     $i++;
     $values = [];         
}

FLATFILE, FF, Test.csv, , 11, TestCN
FLATFILE, FF2, Test2.csv, , 11, TestCN2

R (using xml library) R (使用xml库)

library(XML) 

setwd("C:\\Path\\To\\Working\\Directory")

doc<-xmlParse("DTSfile.xml")

nodes <- as.list(xpathSApply(doc, '//DTS:ConnectionManager'))

checkPath <- function (xpathstr) {
  if (length(as.list(xpathSApply(doc, xpathstr))) > 0) {
    return(as.list(xpathSApply(doc, xpathstr)))
  } else {
    return("")
  }
}

for (i in (1:length(nodes))) {  
  if (i %% 2) {
    data <- checkPath(sprintf('//DTS:ConnectionManager[%d]/@DTS:CreationName', i))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/@DTS:ObjectName', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/@DTS:ConnectionString', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:MaximumWidth', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:DataType', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:ObjectName', i)))

    print(paste(data, collapse = ', '))
  }

}

[1] "FLATFILE, FF, Test.csv, , 11, TestCN"
[1] "FLATFILE, FF2, Test2.csv, , 11, TestCN2"

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM