XSLT merge two XML files

I need to merge two XML files using XSLT. The transformation takes place on an XML file that contains a list of XML files to be merged.

list.xml

<?xml version="1.0" encoding="UTF-8" ?>
<files>
    <file>..\src\main\resources\testOne.xml</file>
    <file>..\src\main\resources\testTwo.xml</file>
</files>

These are my two templates to merge:

<xsl:template name="merge_nodes">
    <xsl:param name="fnNewDeept"/>
    <xsl:param name="snNewDeept"/>

    <xsl:for-each select="$fnNewDeept">
        <xsl:call-template name="merge_node">
            <xsl:with-param name="first-node" select="$fnNewDeept"/>
            <xsl:with-param name="second-node" select="$snNewDeept"/>
        </xsl:call-template>
    </xsl:for-each>
</xsl:template>

<xsl:template name="merge_node">
    <xsl:param name="first-node" />
    <xsl:param name="second-node" />

    <xsl:element name="{name(current())}">
        <xsl:for-each select="$second-node/@*">
            <xsl:copy/>
        </xsl:for-each>
        <xsl:if test="$first-node = '' and not(boolean($first-node/*) and boolean($second-node/*))">
            <xsl:value-of select="$second-node"/>
        </xsl:if>

        <xsl:for-each select="$first-node/@*">
            <xsl:copy/>
        </xsl:for-each>
        <xsl:if test="not(boolean($first-node/*) and boolean($second-node/*))">
            <xsl:value-of select="$first-node"/>
        </xsl:if>

        <xsl:choose>
            <xsl:when test="boolean($first-node/*) or boolean($second-node/*)">     
                <xsl:choose>                                                        
                    <xsl:when test="boolean($first-node/*/*)">                      
                        <xsl:call-template name="merge_nodes">                      
                            <xsl:with-param name="fnNewDeept" select="$first-node/*"/>
                            <xsl:with-param name="snNewDeept" select="$second-node/*"/>
                        </xsl:call-template>
                    </xsl:when>
                    <xsl:otherwise>
                        2. Value: <xsl:value-of select="current()/*"/>
                        2. Current: <xsl:value-of select="name(current()/*)"/>
                        2. First: <xsl:value-of select="name($first-node/*)"/>
                        2. Second: <xsl:value-of select="name($second-node/*)"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:when>
            <xsl:otherwise>
                1. Value: <xsl:value-of select="current()"/>
                1. Current: <xsl:value-of select="name(current())"/>
                1. First: <xsl:value-of select="name($first-node)"/>
                1. Second: <xsl:value-of select="name($second-node)"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:element>
</xsl:template>
The Value, Current, First and Second lines are only there for debugging.

And my two XML files:

1.

<?xml version="1.0" encoding="UTF-8" ?>
<first x="1">
    <second param="wt" second="true">
        <third>abc</third>
        <third>def</third>
    </second>
    <fourth>
        <fifth x="1">hij</fifth>
        <fifth>klm</fifth>
    </fourth>
    <sixth>qrs</sixth>
</first>

2.

<?xml version="1.0" encoding="UTF-8" ?>
<first y="2">
    <second param="123" second="false">
        <third>asd</third>
        <third>def</third>
    </second>
    <fourth>
        <fifth y="2">tuv</fifth>
        <fifth>wxy</fifth>
    </fourth>
    <sixth>678</sixth>
    <sixth>910</sixth>
</first>

I expect the first file to be preferred, so that the second file is merged into the first. Duplicate elements should not occur.

Expected Output:

<?xml version="1.0" encoding="UTF-8" ?>
<first x="1" y="2">
    <second param="wt" second="true">
        <third>abc</third>
        <third>def</third>
        <third>asd</third>
    </second>
    <fourth>
        <fifth x="1">hij</fifth>
        <fifth>klm</fifth>
        <fifth y="2">tuv</fifth>
        <fifth>wxy</fifth>
    </fourth>
    <sixth>qrs</sixth>
    <sixth>678</sixth>
    <sixth>910</sixth>
</first>

Output I got:

<?xml version="1.0" encoding="windows-1252"?><first y="2" x="1">
<second param="wt" second="true">
                            2. Value: abc
                            2. Current: third
                            2. First: third
                            2. Second: third</second>
<fourth param="wt" second="true">
                            2. Value: hij
                            2. Current: fifth
                            2. First: third
                            2. Second: third</fourth>
<sixth param="wt" second="true">
                            2. Value: 
                            2. Current: 
                            2. First: third
                            2. Second: third</sixth>
</first>

I don't know how to walk both trees at the same time so that I can copy the elements. Does anybody have any ideas? I can only use Apache Xalan; I am using the newest version, 2.7.2.

Edit: Since there has already been a misunderstanding: the transformation must also be applicable to other, similarly structured XML files. That is the big problem.

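<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
    <!-- The second file. document() is the XSLT 1.0 function; doc() needs
         XSLT 2.0, which Xalan does not support -->
    <xsl:variable name="doc" select="document('merge2.xml')"/>
    <!-- Identity template: copy everything that is not handled below -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="first">
        <xsl:copy>
            <!-- An attribute added later replaces an earlier one with the same
                 name, so the first file's attributes come last -->
            <xsl:apply-templates select="$doc/first/@*"/>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="second">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates/>
            <!-- Append only the third elements whose value is not present yet -->
            <xsl:copy-of select="$doc/first/second/third[not(. = current()/third)]"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="fourth">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates/>
            <xsl:copy-of select="$doc/first/fourth/fifth[not(. = current()/fifth)]"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="sixth">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
        <!-- After the last sixth of the first file, append the sixth elements
             of the second file whose value is not present yet -->
        <xsl:if test="not(following-sibling::sixth)">
            <xsl:copy-of select="$doc/first/sixth[not(. = current()/../sixth)]"/>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
Check it.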

I have tried to understand your requirements and adapted parts of Oliver Becker's merge implementation, which is explained at http://web.archive.org/web/20160502222322/http://www2.informatik.hu-berlin.de/~obecker/XSLT/#merge and shown at http://web.archive.org/web/20160502194427/http://www2.informatik.hu-berlin.de/~obecker/XSLT/merge/merge.xslt.html. I changed two things. First, the comparison of element nodes: elements with no child elements are considered equivalent if name, namespace and content match, while elements with complex content (i.e. element children) are considered equivalent if they have the same name and namespace. Second, the merging of elements: attributes from the second document are now copied as well, and they are copied before those of the first document, as I think you want the attributes in the first document to take preference.

I have tested that online at https://xsltfiddle.liberty-development.net/nc4NzQq with Saxon 9.8 HE, but only because an XSLT 2 or 3 processor easily allows me to inline the second document, so that I can share the code and the result in an executable manner. Selecting nodes from such an inline parameter ( $with-doc/node() ) is not allowed in XSLT 1.0, so this test setup will not run unchanged in Xalan. So I have

<!--
   Merging two XML files
   Version 1.6
   LGPL (c) Oliver Becker, 2002-07-05
   obecker@informatik.hu-berlin.de
-->
<xslt:transform xmlns:xslt="http://www.w3.org/1999/XSL/Transform" xmlns:m="http://informatik.hu-berlin.de/merge" version="1.0" exclude-result-prefixes="m">


<!-- Normalize the contents of text, comment, and processing-instruction
     nodes before comparing?
     Default: yes -->
<xslt:param name="normalize" select="'yes'" />

<!-- Don't merge elements with this (qualified) name -->
<xslt:param name="dontmerge" />

<!-- If set to true, text nodes in file1 will be replaced -->
<xslt:param name="replace" select="false()" />

<!-- Variant 1: Source document looks like
     <?xml version="1.0"?>
     <merge xmlns="http://informatik.hu-berlin.de/merge">
        <file1>file1.xml</file1>
        <file2>file2.xml</file2>
     </merge>         
     The transformation sheet merges file1.xml and file2.xml.
-->
<xslt:template match="m:merge">
   <xslt:variable name="file1" select="string(m:file1)" />
   <xslt:variable name="file2" select="string(m:file2)" />
   <xslt:message>
      <xslt:text />Merging '<xslt:value-of select="$file1" />
      <xslt:text />' and '<xslt:value-of select="$file2" />'<xslt:text />
   </xslt:message>
   <xslt:if test="$file1='' or $file2=''">
      <xslt:message terminate="yes">
         <xslt:text>No files to merge specified</xslt:text>
      </xslt:message>
   </xslt:if>
   <xslt:call-template name="m:merge">
      <xslt:with-param name="nodes1" select="document($file1,/*)/node()" />
      <xslt:with-param name="nodes2" select="document($file2,/*)/node()" />
   </xslt:call-template>
</xslt:template>


<!-- Variant 2:
     The transformation sheet merges the source document with the
     document provided by the parameter "with".
-->
<xslt:param name="with" />

<!-- for testing I inline the second document -->
<xslt:param name="with-doc">
<first y="2">
    <second param="123" second="false">
        <third>asd</third>
        <third>def</third>
    </second>
    <fourth>
        <fifth y="2">tuv</fifth>
        <fifth>wxy</fifth>
    </fourth>
    <sixth>678</sixth>
    <sixth>910</sixth>
</first>    
</xslt:param>

<xslt:template match="*">
   <xslt:message>
      <xslt:text />Merging input with '<xslt:value-of select="$with" />
      <xslt:text>'</xslt:text>
   </xslt:message>
   <!--<xslt:if test="string($with)=''">
      <xslt:message terminate="yes">
         <xslt:text>No input file specified (parameter 'with')</xslt:text>
      </xslt:message>
   </xslt:if>-->

   <xslt:call-template name="m:merge">
      <xslt:with-param name="nodes1" select="/node()" />
      <!-- <xslt:with-param name="nodes2" select="document($with,/*)/node()" /> -->
      <xslt:with-param name="nodes2" select="$with-doc/node()" />
   </xslt:call-template>
</xslt:template>


<!-- ============================================================== -->

<!-- The "merge" template -->
<xslt:template name="m:merge">
   <xslt:param name="nodes1" />
   <xslt:param name="nodes2" />

   <xslt:choose>
      <!-- Is $nodes1 resp. $nodes2 empty? -->
      <xslt:when test="count($nodes1)=0">
         <xslt:copy-of select="$nodes2" />
      </xslt:when>
      <xslt:when test="count($nodes2)=0">
         <xslt:copy-of select="$nodes1" />
      </xslt:when>

      <xslt:otherwise>
         <!-- Split $nodes1 and $nodes2 -->
         <xslt:variable name="first1" select="$nodes1[1]" />
         <xslt:variable name="rest1" select="$nodes1[position()!=1]" />
         <xslt:variable name="first2" select="$nodes2[1]" />
         <xslt:variable name="rest2" select="$nodes2[position()!=1]" />
         <!-- Determine type of node $first1 -->
         <xslt:variable name="type1">
            <xslt:apply-templates mode="m:detect-type" select="$first1" />
         </xslt:variable>

         <!-- Compare $first1 and $first2 -->
         <xslt:variable name="diff-first">
            <xslt:call-template name="m:compare-nodes">
               <xslt:with-param name="node1" select="$first1" />
               <xslt:with-param name="node2" select="$first2" />
            </xslt:call-template>
         </xslt:variable>

         <xslt:choose>
            <!-- $first1 != $first2 -->
            <xslt:when test="$diff-first='!'">
               <!-- Compare $first1 and $rest2 -->
               <xslt:variable name="diff-rest">
                  <xslt:for-each select="$rest2">
                     <xslt:call-template name="m:compare-nodes">
                        <xslt:with-param name="node1" select="$first1" />
                        <xslt:with-param name="node2" select="." />
                     </xslt:call-template>
                  </xslt:for-each>
               </xslt:variable>

               <xslt:choose>
                  <!-- $first1 is in $rest2 and 
                       $first1 is *not* an empty text node  -->
                  <xslt:when test="contains($diff-rest,'=') and not($type1='text' and normalize-space($first1)='')">
                     <!-- determine position of $first1 in $nodes2
                          and copy all preceding nodes of $nodes2 -->
                     <xslt:variable name="pos" select="string-length(substring-before( $diff-rest,'=')) + 2" />
                     <xslt:copy-of select="$nodes2[position() &lt; $pos]" />
                     <!-- merge $first1 with its equivalent node -->
                     <xslt:choose>
                        <!-- Elements: merge -->
                        <xslt:when test="$type1='element'">
                           <xslt:element name="{name($first1)}" namespace="{namespace-uri($first1)}">
                              <xslt:copy-of select="$first1/namespace::*" />
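                              <!-- take these from the node of $nodes2 that matched $first1;
                                   in this branch $first2 is a different node -->
                              <xslt:copy-of select="$nodes2[position()=$pos]/namespace::*" />
                              <xslt:copy-of select="$nodes2[position()=$pos]/@*"/>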
                              <xslt:copy-of select="$first1/@*" />
                              <xslt:call-template name="m:merge">
                                 <xslt:with-param name="nodes1" select="$first1/node()" />
                                 <xslt:with-param name="nodes2" select="$nodes2[position()=$pos]/node()" />
                              </xslt:call-template>
                           </xslt:element>
                        </xslt:when>
                        <!-- Other: copy -->
                        <xslt:otherwise>
                           <xslt:copy-of select="$first1" />
                        </xslt:otherwise>
                     </xslt:choose>

                     <!-- Merge $rest1 and rest of $nodes2 -->
                     <xslt:call-template name="m:merge">
                        <xslt:with-param name="nodes1" select="$rest1" />
                        <xslt:with-param name="nodes2" select="$nodes2[position() &gt; $pos]" />
                     </xslt:call-template>
                  </xslt:when>

                  <!-- $first1 is a text node and replace mode was
                       activated -->
                  <xslt:when test="$type1='text' and $replace">
                     <xslt:call-template name="m:merge">
                        <xslt:with-param name="nodes1" select="$rest1" />
                        <xslt:with-param name="nodes2" select="$nodes2" />
                     </xslt:call-template>
                  </xslt:when>

                  <!-- else: $first1 is not in $rest2 or
                       $first1 is an empty text node -->
                  <xslt:otherwise>
                     <xslt:copy-of select="$first1" />
                     <xslt:call-template name="m:merge">
                        <xslt:with-param name="nodes1" select="$rest1" />
                        <xslt:with-param name="nodes2" select="$nodes2" />
                     </xslt:call-template>
                  </xslt:otherwise>
               </xslt:choose>
            </xslt:when>

            <!-- else: $first1 = $first2 -->
            <xslt:otherwise>
               <xslt:choose>
                  <!-- Elements: merge -->
                  <xslt:when test="$type1='element'">
                     <xslt:element name="{name($first1)}" namespace="{namespace-uri($first1)}">
                        <xslt:copy-of select="$first2/namespace::*" />
                        <xslt:copy-of select="$first1/namespace::*" />
                        <xslt:copy-of select="$first2/@*" />
                        <xslt:copy-of select="$first1/@*" />
                        <xslt:call-template name="m:merge">
                           <xslt:with-param name="nodes1" select="$first1/node()" />
                           <xslt:with-param name="nodes2" select="$first2/node()" />
                        </xslt:call-template>
                     </xslt:element>
                  </xslt:when>
                  <!-- Other: copy -->
                  <xslt:otherwise>
                     <xslt:copy-of select="$first1" />
                  </xslt:otherwise>
               </xslt:choose>

               <!-- Merge $rest1 and $rest2 -->
               <xslt:call-template name="m:merge">
                  <xslt:with-param name="nodes1" select="$rest1" />
                  <xslt:with-param name="nodes2" select="$rest2" />
               </xslt:call-template>
            </xslt:otherwise>
         </xslt:choose>
      </xslt:otherwise>
   </xslt:choose>
</xslt:template>


<!-- Comparing single nodes: 
     if $node1 and $node2 are equivalent then the template creates a 
     text node "=" otherwise a text node "!" -->
<xslt:template name="m:compare-nodes">
   <xslt:param name="node1" />
   <xslt:param name="node2" />
   <xslt:variable name="type1">
      <xslt:apply-templates mode="m:detect-type" select="$node1" />
   </xslt:variable>
   <xslt:variable name="type2">
      <xslt:apply-templates mode="m:detect-type" select="$node2" />
   </xslt:variable>

   <xslt:choose>
      <!-- Are $node1 and $node2 complex element nodes with the same name? -->
      <xslt:when test="$type1='element' and $type2='element' and $node1/* and $node2/* and local-name($node1)=local-name($node2) and namespace-uri($node1)=namespace-uri($node2) and name($node1)!=$dontmerge and name($node2)!=$dontmerge">
          <xslt:text>=</xslt:text>
      </xslt:when>

      <!-- Are $node1 and $node2 simple elements with the same name and same content -->
      <xslt:when test="$type1='element' and $type2='element' and not($node1/*) and not($node2/*) and local-name($node1)=local-name($node2) and namespace-uri($node1)=namespace-uri($node2) and $node1 = $node2 and name($node1)!=$dontmerge and name($node2)!=$dontmerge">
          <xslt:text>=</xslt:text>
      </xslt:when>

      <!-- Other nodes: test for the same type and content -->
      <xslt:when test="$type1!='element' and $type1=$type2 and name($node1)=name($node2) and ($node1=$node2 or ($normalize='yes' and normalize-space($node1)= normalize-space($node2)))">=</xslt:when>

      <!-- Otherwise: different node types or different name/content -->
      <xslt:otherwise>!</xslt:otherwise>
   </xslt:choose>
</xslt:template>


<!-- Type detection, thanks to M. H. Kay -->
<xslt:template match="*" mode="m:detect-type">element</xslt:template>
<xslt:template match="text()" mode="m:detect-type">text</xslt:template>
<xslt:template match="comment()" mode="m:detect-type">comment</xslt:template>
<xslt:template match="processing-instruction()" mode="m:detect-type">pi</xslt:template>

</xslt:transform>

which then gives the result

<first y="2" x="1">
    <second param="wt" second="true">
        <third>abc</third>
        <third>asd</third><third>def</third>
    </second>
    <fourth>
        <fifth x="1">hij</fifth>
        <fifth>klm</fifth>
    <fifth y="2">tuv</fifth><fifth>wxy</fifth></fourth>
    <sixth>qrs</sixth>
<sixth>678</sixth><sixth>910</sixth></first>

This is close to your wanted result. I am not sure whether you can have mixed content (i.e. text and element children mixed). If not, I think using xsl:strip-space and xsl:output indent="yes", as done in https://xsltfiddle.liberty-development.net/nc4NzQq/1, gives you a clean result:

<first y="2" x="1">
   <second param="wt" second="true">
      <third>abc</third>
      <third>asd</third>
      <third>def</third>
   </second>
   <fourth>
      <fifth x="1">hij</fifth>
      <fifth>klm</fifth>
      <fifth y="2">tuv</fifth>
      <fifth>wxy</fifth>
   </fourth>
   <sixth>qrs</sixth>
   <sixth>678</sixth>
   <sixth>910</sixth>
</first>
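
For reference, the two declarations mentioned above are top-level children of the stylesheet element. The following is only a minimal sketch: it assumes the merge stylesheet shown earlier is saved as merge.xslt (a file name made up here) and imports it so that the example is complete on its own. The xalan:indent-amount attribute is not part of the fiddle. It is a Xalan-specific extension added on the assumption that Xalan's default indent amount is 0, so please verify it against your Xalan version; other processors should simply ignore it.

```xml
<xslt:transform xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan="http://xml.apache.org/xalan"
                version="1.0" exclude-result-prefixes="xalan">

   <!-- Assumption: the merge stylesheet is stored as merge.xslt next to this file -->
   <xslt:import href="merge.xslt" />

   <!-- Drop whitespace-only text nodes from the input documents.
        Only safe if the documents contain no mixed content. -->
   <xslt:strip-space elements="*" />

   <!-- Let the serializer add line breaks and indentation.
        xalan:indent-amount is a Xalan extension attribute (assumption, see above). -->
   <xslt:output method="xml" indent="yes" xalan:indent-amount="3" />

</xslt:transform>
```

Stripping the whitespace-only text nodes also means the merge no longer has to compare them, which is why the elements of both documents end up evenly indented.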

On the other hand, I had the second sample inline, where white space is stripped. It might suffice to assume that this does not happen in the normal case of using the document function. Simulating that in https://xsltfiddle.liberty-development.net/nc4NzQq/2 with xml:space="preserve" on the inline sample, the result

<first y="2" xml:space="preserve" x="1">
    <second param="wt" second="true">
        <third>abc</third>
        <third>asd</third>
        <third>def</third>
    </second>
    <fourth>
        <fifth x="1">hij</fifth>
        <fifth>klm</fifth>
    <fifth y="2">tuv</fifth>
        <fifth>wxy</fifth>
    </fourth>
    <sixth>qrs</sixth>
<sixth>678</sixth>
    <sixth>910</sixth>
</first>

also looks promising. So try to adapt it to use the with parameter and the document function, as in the original; then you might get the wanted result, at least for the two samples you have shown. It is hard to tell whether it is a generic solution, as I think the whole idea of merging depends very much on a clear specification of how exactly to compare nodes.
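
One way that adaptation could look, as an untested sketch under a few assumptions: the merge stylesheet above is kept in its own file, here called merge.xslt (a made-up name), and a small driver stylesheet imports it. The template for * only restores the two lines that are commented out above. The template for files is hypothetical and not part of Oliver Becker's code; it targets the list.xml from the question and assumes that the listed paths are relative to list.xml.

```xml
<xslt:transform xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
                xmlns:m="http://informatik.hu-berlin.de/merge"
                version="1.0" exclude-result-prefixes="m">

   <!-- Assumption: the merge stylesheet is stored as merge.xslt next to this file.
        Templates in this driver have higher import precedence, so they
        override the match templates of merge.xslt. -->
   <xslt:import href="merge.xslt" />

   <!-- Variant 2 without the inline test document: merge the source document
        with the file named by the "with" parameter declared in merge.xslt -->
   <xslt:template match="*">
      <xslt:if test="string($with)=''">
         <xslt:message terminate="yes">
            <xslt:text>No input file specified (parameter 'with')</xslt:text>
         </xslt:message>
      </xslt:if>
      <xslt:call-template name="m:merge">
         <xslt:with-param name="nodes1" select="/node()" />
         <!-- /* as second argument: resolve $with relative to the source document -->
         <xslt:with-param name="nodes2" select="document($with,/*)/node()" />
      </xslt:call-template>
   </xslt:template>

   <!-- Hypothetical variant for list.xml: merge the first two listed files.
        It is preferred over match="*" because a name test has a higher
        default priority than a wildcard. -->
   <xslt:template match="files">
      <!-- document() expects URI references, so turn backslashes into slashes -->
      <xslt:variable name="file1" select="translate(string(file[1]), '\', '/')" />
      <xslt:variable name="file2" select="translate(string(file[2]), '\', '/')" />
      <xslt:call-template name="m:merge">
         <xslt:with-param name="nodes1" select="document($file1,/*)/node()" />
         <xslt:with-param name="nodes2" select="document($file2,/*)/node()" />
      </xslt:call-template>
   </xslt:template>

</xslt:transform>
```

With list.xml as the source document the files template does the work. With the first XML file as the source, pass the name of the second file as the stylesheet parameter with. The files template merges only two files; a longer list would need the merge to be applied repeatedly.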

