
XML/XSLT/Access/VBA: How can I merge all child elements (even unknown ones) into one before importing into an Access database?

Current XML:

<?xml version="1.0"?>

<form1>
   <page1>
      <first_name></first_name>
      <last_name></last_name>
      .
      .
   </page1>
   <page2>
      <address></address>
      <phone_number></phone_number>
      .
      .
   </page2>
   <page3>
      <company_name></company_name>
      <job_title></job_title>
      .
      .
   </page3>
</form1>

Desired XML (I want to merge all child elements and rename the parent):

<?xml version="1.0"?>

<form>
   <page>
      <first_name></first_name>
      <last_name></last_name>
      .
      .
      <address></address>
      <phone_number></phone_number>
      .
      .
      <company_name></company_name>
      <job_title></job_title>
      .
      .
   </page>
</form>

Then, since I have thousands of XML files, some with unknown elements, I want to find all of those elements before bulk importing the XML into the Access database, because any new elements in subsequent files will be dropped if they are not defined in the schema.

Not all child elements are known. Not all file names are known.

So, how can I check all files for all elements, add every one of them to the Access table, and then bulk import all the XML records so that they fit the desired schema shown above?

EDIT:

OK, I see: there are no attributes. What I meant was all child elements. Thanks for pointing that out, Oded; I have updated the question with corrections.

This is the VBA code I am using in Access to bulk import the files:

 Private Sub cmdImport_Click()
 Dim strFile As String 'Filename
 Dim strFileList() As String 'File Array
 Dim intFile As Integer 'File Number
 Dim strPath As String ' Path to file folder

 strPath = "C:\Users\Main\Desktop\XML-files\" 'trailing backslash needed so strPath & file name points inside the folder
 strFile = Dir(strPath & "*.XML")

 While strFile <> ""
      'add files to the list
     intFile = intFile + 1
     ReDim Preserve strFileList(1 To intFile)
     strFileList(intFile) = strFile
     strFile = Dir()
 Wend
 'see if any files were found
 If intFile = 0 Then
     MsgBox "No files found"
     Exit Sub
 End If

 'cycle through the list of files
 For intFile = 1 To UBound(strFileList)
     Application.ImportXML strPath & strFileList(intFile), acAppendData

 Next intFile
MsgBox "Import Completed"

End Sub

I can use the stylesheet to transform the XML like this:

  For intFile = 1 To UBound(strFileList)
     Application.TransformXML strPath & strFileList(intFile), _
     "C:\Users\Main\Desktop\stylesheet2.xslt", _
     "C:\Users\Main\Desktop\temp.xml", True
     Application.ImportXML "C:\Users\Main\Desktop\temp.xml", acAppendData
   Next intFile

 MsgBox "Import Completed"
End Sub

However, it does not merge the elements from all the files into one table. Am I missing something? Do I need to save a list of variables, or create some attribute IDs?
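One way to approach the "find every element first" step is to scan the folder before importing anything. The following is only a sketch under assumptions: it uses late-bound MSXML 6.0 and Scripting.Dictionary objects, the helper names CollectFieldNames and AddMissingFields are made up for illustration, every new column is created as a 255-character text field, and the target table (for example one named page) already exists and is not open.

```vba
' Sketch: gather every distinct element name found at /*/*/* in a folder of
' XML files, then add any missing columns to an existing Access table.
' CollectFieldNames and AddMissingFields are hypothetical helper names.
Public Function CollectFieldNames(ByVal strPath As String) As Object
    Dim dictNames As Object   ' Scripting.Dictionary used as a set of names
    Dim xmlDoc As Object      ' MSXML2.DOMDocument.6.0
    Dim xmlNode As Object
    Dim strFile As String

    Set dictNames = CreateObject("Scripting.Dictionary")
    If Right$(strPath, 1) <> "\" Then strPath = strPath & "\"

    strFile = Dir(strPath & "*.xml")
    Do While strFile <> ""
        Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
        xmlDoc.async = False
        If xmlDoc.Load(strPath & strFile) Then
            ' Grandchildren of the document element, e.g. first_name
            For Each xmlNode In xmlDoc.SelectNodes("/*/*/*")
                If Not dictNames.Exists(xmlNode.nodeName) Then
                    dictNames.Add xmlNode.nodeName, True
                End If
            Next xmlNode
        Else
            Debug.Print "Skipped (could not be parsed): " & strFile
        End If
        strFile = Dir()
    Loop

    Set CollectFieldNames = dictNames
End Function

Public Sub AddMissingFields(ByVal strTable As String, ByVal dictNames As Object)
    Dim tdf As DAO.TableDef
    Dim fld As DAO.Field
    Dim varName As Variant
    Dim blnFound As Boolean

    Set tdf = CurrentDb.TableDefs(strTable)
    For Each varName In dictNames.Keys
        blnFound = False
        For Each fld In tdf.Fields
            If StrComp(fld.Name, varName, vbTextCompare) = 0 Then
                blnFound = True
                Exit For
            End If
        Next fld
        ' Assumption: every field can be stored as short text
        If Not blnFound Then tdf.Fields.Append tdf.CreateField(varName, dbText, 255)
    Next varName
End Sub
```

It could be called as AddMissingFields "page", CollectFieldNames("C:\Users\Main\Desktop\XML-files") before the import loop runs, so that no element is dropped for lack of a matching column.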

EDIT: From comments

My file names are 1.xml, 2.xml, 3.xml, 4.xml, etc., but as I said, there are thousands of them.

This stylesheet produces the output that you described.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" />

<xsl:template match="/">
<!--Generate the standard document element and its child element-->
<form>
    <page>
            <!--Apply templates to the children of each child of the document element-->
        <xsl:apply-templates select="/*/*/node()" />
    </page>
</form>
</xsl:template>

<!--Identity template copies all content forward-->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

If you just want to copy the elements under the page elements, rather than any node() (element, text, comment, or processing instruction), then you could change the XPath from /*/*/node() to /*/*/*
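To see the difference between the two expressions from VBA, a small check along these lines may help. It is a sketch that assumes late-bound MSXML 6.0 with its default settings; CompareNodeTests is a made-up name and the inline document is a cut-down sample with a comment added.

```vba
' Sketch: compare what /*/*/node() and /*/*/* select on a tiny sample.
' CompareNodeTests is a hypothetical name; the sample XML is illustrative.
Public Sub CompareNodeTests()
    Dim xmlDoc As Object
    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False
    xmlDoc.LoadXML "<form1><page1><!-- note -->" & _
                   "<first_name>D</first_name></page1></form1>"

    ' node() matches the comment as well as the element
    Debug.Print "node(): " & xmlDoc.SelectNodes("/*/*/node()").Length
    ' * matches element nodes only
    Debug.Print "*:      " & xmlDoc.SelectNodes("/*/*/*").Length
End Sub
```

The first count includes the comment node, so it is larger than the second, which counts only the first_name element.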

If you really just want to copy all the contents of the page{N} elements, then this transformation is probably the shortest:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
   <form>
    <page>
      <xsl:copy-of select="/*/*/node()"/>
    </page>
   </form>
 </xsl:template>
</xsl:stylesheet>

Suppose these input documents:

1.xml

<form1>
   <page1>
        <first_name>D</first_name>
        <last_name>E</last_name>
   </page1>
   <page2>
        <address>F</address>
        <phone_number>1</phone_number>
   </page2>
   <page3>
        <company_name>G</company_name>
   </page3>
</form1>

2.xml

<form2>
   <page1>
        <first_name>A</first_name>
   </page1>
   <page2>
        <address>B</address>
   </page2>
   <page3>
        <company_name>C</company_name>
        <job_title>H</job_title>
   </page3>
</form2>

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kElementByName" match="/*/*/*" use="name()"/>
    <xsl:param name="pMaxFileNumber" select="2"/>
    <xsl:template match="/">
        <xsl:variable name="vFieldsNames">
            <xsl:call-template name="names">
                <xsl:with-param name="pFrom" select="1"/>
                <xsl:with-param name="pTo" select="$pMaxFileNumber"/>
                <xsl:with-param name="pFieldsNames" select="'|'"/>
            </xsl:call-template>
        </xsl:variable>
        <form>
            <xsl:call-template name="merge">
                <xsl:with-param name="pFrom" select="1"/>
                <xsl:with-param name="pTo" select="$pMaxFileNumber"/>
                <xsl:with-param name="pFieldsNames" select="$vFieldsNames"/>
            </xsl:call-template>
        </form>
    </xsl:template>
    <xsl:template name="names">
        <xsl:param name="pFrom"/>
        <xsl:param name="pTo"/>
        <xsl:param name="pFieldsNames"/>
        <xsl:choose>
            <xsl:when test="$pFrom = $pTo">
                <xsl:value-of select="$pFieldsNames"/>
                <xsl:apply-templates
                     select="document(concat($pFrom,'.xml'),/)/*/*/*
                                       [count(.|key('kElementByName',
                                                    name())[1])=1]
                                       [not(contains($pFieldsNames,
                                                     concat('|',name(),'|')))]"
                     mode="names"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="vNewTop"
                              select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
                <xsl:variable name="vNewFieldsNames">
                    <xsl:call-template name="names">
                        <xsl:with-param name="pFrom" select="$pFrom"/>
                        <xsl:with-param name="pTo" select="$vNewTop"/>
                        <xsl:with-param name="pFieldsNames"
                                        select="$pFieldsNames"/>
                    </xsl:call-template>
                </xsl:variable>
                <xsl:call-template name="names">
                    <xsl:with-param name="pFrom" select="$vNewTop + 1"/>
                    <xsl:with-param name="pTo" select="$pTo"/>
                    <xsl:with-param name="pFieldsNames"
                                    select="$vNewFieldsNames"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="merge">
        <xsl:param name="pFrom"/>
        <xsl:param name="pTo"/>
        <xsl:param name="pFieldsNames"/>
        <xsl:choose>
            <xsl:when test="$pFrom = $pTo">
                <page>
                    <xsl:apply-templates
                     select="document(concat($pFrom,'.xml'),/)/*/*[1]/*[1]">
                        <xsl:with-param name="pFieldsNames"
                                        select="$pFieldsNames"/>
                    </xsl:apply-templates>
                </page>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="vNewTop"
                              select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
                <xsl:call-template name="merge">
                    <xsl:with-param name="pFrom" select="$pFrom"/>
                    <xsl:with-param name="pTo" select="$vNewTop"/>
                    <xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
                </xsl:call-template>
                <xsl:call-template name="merge">
                    <xsl:with-param name="pFrom" select="$vNewTop + 1"/>
                    <xsl:with-param name="pTo" select="$pTo"/>
                    <xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template match="/*/*">
        <xsl:param name="pFieldsNames"/>
        <xsl:apply-templates select="*[1]">
            <xsl:with-param name="pFieldsNames" select="$pFieldsNames"/>
        </xsl:apply-templates>
    </xsl:template>
    <xsl:template match="/*/*/*" name="copy">
        <xsl:param name="pFieldsNames"/>
        <xsl:copy>
            <xsl:value-of select="."/>
        </xsl:copy>
        <xsl:variable name="vName" select="concat('|',name(),'|')"/>
        <xsl:apply-templates select="following::*[1]">
            <xsl:with-param name="pFieldsNames"
                            select="concat(substring-before($pFieldsNames,
                                                            $vName),
                                           '|',
                                           substring-after($pFieldsNames,
                                                           $vName))"/>
        </xsl:apply-templates>
    </xsl:template>
    <xsl:template match="/*/*[last()]/*[last()]">
        <xsl:param name="pFieldsNames"/>
        <xsl:call-template name="copy"/>
        <xsl:variable name="vName" select="concat('|',name(),'|')"/>
        <xsl:call-template name="empty">
            <xsl:with-param name="pFieldsNames"
                            select="substring(
                                      concat(substring-before($pFieldsNames,
                                                              $vName),
                                             '|',
                                             substring-after($pFieldsNames,
                                                             $vName)),
                                      2)"/>
        </xsl:call-template>
    </xsl:template>
    <xsl:template match="/*/*/*" mode="names">
        <xsl:value-of select="concat(name(),'|')"/>
    </xsl:template>
    <xsl:template name="empty">
        <xsl:param name="pFieldsNames"/>
        <xsl:if test="$pFieldsNames!=''">
            <xsl:element name="{substring-before($pFieldsNames,'|')}"/>
            <xsl:call-template name="empty">
                <xsl:with-param name="pFieldsNames"
                           select="substring-after($pFieldsNames,'|')"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

Output:

<form>
    <page>
        <first_name>D</first_name>
        <last_name>E</last_name>
        <address>F</address>
        <phone_number>1</phone_number>
        <company_name>G</company_name>
        <job_title />
    </page>
    <page>
        <first_name>A</first_name>
        <address>B</address>
        <company_name>C</company_name>
        <job_title>H</job_title>
        <last_name />
        <phone_number />
    </page>
</form>

Note: If this uses too much memory, split it into two stylesheets: the first outputs the names, the second does the merge. If you can't pass a parameter with Application.TransformXML, then the maximum number of files is whatever default is fixed in the stylesheet (pMaxFileNumber). Also, there must not be any gaps in the numbering: if the maximum number of files is 3, 2.xml must not be missing, because document() throws an error when a file cannot be loaded.
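If passing pMaxFileNumber through Application.TransformXML is not possible, one workaround to consider is running the stylesheet through MSXML directly, which does accept parameters. The sketch below rests on several assumptions: late-bound MSXML 6.0 objects are available, the names CountContiguousFiles and TransformWithMax are invented for this example, and document() has to be enabled through the AllowDocumentFunction property because MSXML 6.0 turns it off by default. Other security settings may also matter in a given environment.

```vba
' Sketch: find the largest N such that 1.xml .. N.xml all exist, then run a
' stylesheet with that value passed as the pMaxFileNumber parameter.
' Both function names are hypothetical.
Public Function CountContiguousFiles(ByVal strPath As String) As Long
    Dim lngN As Long
    If Right$(strPath, 1) <> "\" Then strPath = strPath & "\"
    ' Stops at the first missing number, so a gap shortens the run
    Do While Len(Dir(strPath & (lngN + 1) & ".xml")) > 0
        lngN = lngN + 1
    Loop
    CountContiguousFiles = lngN
End Function

Public Function TransformWithMax(ByVal strInput As String, _
                                 ByVal strXslt As String, _
                                 ByVal lngMax As Long) As String
    Dim xmlDoc As Object, xslDoc As Object, tmpl As Object, proc As Object

    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False
    If Not xmlDoc.Load(strInput) Then
        Err.Raise vbObjectError + 1, , "Cannot load " & strInput
    End If

    ' XSLTemplate needs a free-threaded document for the stylesheet
    Set xslDoc = CreateObject("MSXML2.FreeThreadedDOMDocument.6.0")
    xslDoc.async = False
    xslDoc.setProperty "AllowDocumentFunction", True
    If Not xslDoc.Load(strXslt) Then
        Err.Raise vbObjectError + 2, , "Cannot load " & strXslt
    End If

    Set tmpl = CreateObject("MSXML2.XSLTemplate.6.0")
    Set tmpl.stylesheet = xslDoc
    Set proc = tmpl.createProcessor()
    proc.input = xmlDoc
    proc.addParameter "pMaxFileNumber", lngMax
    proc.transform
    TransformWithMax = proc.output   ' result as a string
End Function
```

Comparing the value returned by CountContiguousFiles with the total number of *.xml files in the folder is a cheap way to detect a gap in the numbering before the transformation is attempted.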

EDIT: For a two-pass transformation.

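This stylesheet, run against any input document (its content is not used, although document(..., /) resolves the file names relative to that document's location):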

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="pMaxFileNumber" select="2"/>
    <xsl:template match="/">
        <form>
            <xsl:call-template name="copy">
                <xsl:with-param name="pFrom" select="1"/>
                <xsl:with-param name="pTo" select="$pMaxFileNumber"/>
            </xsl:call-template>
        </form>
    </xsl:template>
    <xsl:template name="copy">
        <xsl:param name="pFrom"/>
        <xsl:param name="pTo"/>
        <xsl:choose>
            <xsl:when test="$pFrom = $pTo">
                <page>
                    <xsl:copy-of 
                     select="document(concat($pFrom,'.xml'),/)/*/*/*"/>
                </page>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="vMiddle"
                      select="floor(($pTo - $pFrom) div 2) + $pFrom"/>
                <xsl:call-template name="copy">
                    <xsl:with-param name="pFrom" select="$pFrom"/>
                    <xsl:with-param name="pTo" select="$vMiddle"/>
                </xsl:call-template>
                <xsl:call-template name="copy">
                    <xsl:with-param name="pFrom" select="$vMiddle + 1"/>
                    <xsl:with-param name="pTo" select="$pTo"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

Output:

<form>
    <page>
        <first_name>D</first_name>
        <last_name>E</last_name>
        <address>F</address>
        <phone_number>1</phone_number>
        <company_name>G</company_name>
    </page>
    <page>
        <first_name>A</first_name>
        <address>B</address>
        <company_name>C</company_name>
        <job_title>H</job_title>
    </page>
</form>

And this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kElementByName" match="/*/*/*" use="name()"/>
    <xsl:variable name="vElements"
                  select="/*/*/*[count(.|key('kElementByName',name())[1])=1]"/>
    <xsl:template match="form">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="page">
        <xsl:copy>
            <xsl:apply-templates select="$vElements">
                <xsl:with-param name="pContext" select="."/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="/*/*/*">
        <xsl:param name="pContext"/>
        <xsl:element name="{name()}">
            <xsl:value-of select="$pContext/*[name()=name(current())]"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>

With the previous output as input, the result is:

<form>
    <page>
        <first_name>D</first_name>
        <last_name>E</last_name>
        <address>F</address>
        <phone_number>1</phone_number>
        <company_name>G</company_name>
        <job_title></job_title>
    </page>
    <page>
        <first_name>A</first_name>
        <last_name></last_name>
        <address>B</address>
        <phone_number></phone_number>
        <company_name>C</company_name>
        <job_title>H</job_title>
    </page>
</form>
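From Access, the two passes could be chained roughly as follows. This is a sketch, not a tested recipe: the file names pass1.xslt, pass2.xslt, merged.xml and normalized.xml are placeholders, the folder layout follows the paths used in the question, and it assumes that Application.TransformXML allows the document() calls made by the first stylesheet, which the thread does not confirm.

```vba
' Sketch: run the two stylesheets in sequence, then import the result.
' ImportMergedXml and all four file names below are hypothetical.
Public Sub ImportMergedXml()
    Const strDir As String = "C:\Users\Main\Desktop\"

    ' Pass 1: the input's content is ignored, but it should sit in the same
    ' folder as 1.xml, 2.xml, ... because the file names are resolved
    ' relative to it.
    Application.TransformXML strDir & "XML-files\1.xml", _
                             strDir & "pass1.xslt", _
                             strDir & "merged.xml", True

    ' Pass 2: give every page the same set of child elements.
    Application.TransformXML strDir & "merged.xml", _
                             strDir & "pass2.xslt", _
                             strDir & "normalized.xml", True

    ' One import instead of one per file.
    Application.ImportXML strDir & "normalized.xml", acAppendData
    MsgBox "Import Completed"
End Sub
```

Because every page element in normalized.xml carries the same fields, the table needs only the columns discovered by the second pass, and a single ImportXML call replaces the per-file loop.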
