XSLT 1.0 - Merge sibling nodes with child nodes into new composite nodes

I had a tough time formulating the question title. Maybe the example will make more sense.

Suppose I have an XML document from system A that looks like this:

<root>
    <phone_numbers>
        <phone_number type="work">123-WORK</phone_number>
        <phone_number type="home">456-HOME</phone_number>
        <phone_number type="work">789-WORK</phone_number>
        <phone_number type="other">012-OTHER</phone_number>
    </phone_numbers>
    <email_addresses>
        <email_address type="home">a@home</email_address>
        <email_address type="other">b@other</email_address>
        <email_address type="home">c@home</email_address>
        <email_address type="work">d@work</email_address>
        <email_address type="other">e@other</email_address>
        <email_address type="other">f@other</email_address>
    </email_addresses>
</root>

I have to fit these into a structure like this so they can be used in system B:

<root>
    <addresses>
        <address name="work1">
            <phone_number>123-WORK</phone_number>
            <email_address>d@work</email_address>
        </address>
        <address name="work2">
            <phone_number>789-WORK</phone_number>
        </address>
        <address name="other1">
            <phone_number>012-OTHER</phone_number>
            <email_address>b@other</email_address>
        </address>
        <address name="other2">
            <email_address>e@other</email_address>
        </address>
        <address name="other3">
            <email_address>f@other</email_address>
        </address>
        <address name="home1">
            <phone_number>456-HOME</phone_number>
            <email_address>a@home</email_address>
        </address>
        <address name="home2">
            <email_address>c@home</email_address>
        </address>
    </addresses>
</root>

There can be any number (from 0 to infinity, as far as I know) of email addresses of each type. There can also be any number of phone numbers of each type, and the number of phone numbers of one type does not have to match the number of email addresses of the same type.

The email addresses and phone numbers in the first document aren't really related to each other, except that they appear in the order in which they were added to system A.

I have to pair the emails and phone numbers up by type to fit into system B. I would like the first phone number of type X to be paired with the first email address of type X (the second with the second, and so on), and no phone number of type X to be paired with an email of any other type.
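To pin down the pairing rule independently of XSLT, here is a minimal Python restatement of it. This is only a sketch of the intended behaviour, not a stylesheet: the function name `pair_by_type` and the tuple layout are made up for illustration, and types are assumed to be emitted in order of first appearance in the document (the desired output in the question lists them in a different order).

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest

SOURCE = """<root>
  <phone_numbers>
    <phone_number type="work">123-WORK</phone_number>
    <phone_number type="home">456-HOME</phone_number>
    <phone_number type="work">789-WORK</phone_number>
    <phone_number type="other">012-OTHER</phone_number>
  </phone_numbers>
  <email_addresses>
    <email_address type="home">a@home</email_address>
    <email_address type="other">b@other</email_address>
    <email_address type="home">c@home</email_address>
    <email_address type="work">d@work</email_address>
    <email_address type="other">e@other</email_address>
    <email_address type="other">f@other</email_address>
  </email_addresses>
</root>"""


def pair_by_type(xml_text):
    """Return (name, phone, email) tuples; phone or email is None when unpaired."""
    root = ET.fromstring(xml_text)
    phones, emails, types = {}, {}, []
    sources = (
        (phones, "phone_numbers/phone_number"),
        (emails, "email_addresses/email_address"),
    )
    for bucket, path in sources:
        for node in root.findall(path):
            type_name = node.get("type")
            bucket.setdefault(type_name, []).append(node.text)
            if type_name not in types:  # remember first-appearance order
                types.append(type_name)

    result = []
    for type_name in types:
        # zip_longest pads the shorter list with None, so the n-th phone
        # meets the n-th email of the same type and leftovers stay alone.
        pairs = zip_longest(phones.get(type_name, []), emails.get(type_name, []))
        for index, (phone, email) in enumerate(pairs, start=1):
            result.append((f"{type_name}{index}", phone, email))
    return result


PAIRS = pair_by_type(SOURCE)
```

Each tuple corresponds to one `<address>` element: the first member is the `name` attribute, and a `None` member means that child element is simply absent.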

Since I have to pair them up, and since the order in which they were entered into the system is the closest I'll get to a relationship between the pairs, I would like to pair them in that order. I'll have to tell the users to go over the results to make sure they make sense, but I have to pair them; I have no choice.

To complicate matters, my actual XML document has more kinds of nodes that I'll need to merge along with phone_numbers and email_addresses, and I have more than two @type values.

One other note: I'm already calculating the maximum number of nodes with any given @type, so with my example documents I know that the maximum number of <address> nodes of a single @type is three (three <email_address> nodes with @type=other yield three <address> nodes with @name=otherX).

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="byType" match="/root/*/*" use="@type" />
    <xsl:key name="phoneByType" match="phone_numbers/phone_number"
        use="@type" />
    <xsl:key name="emailByType" match="email_addresses/email_address"
        use="@type" />
    <xsl:template match="/">
        <root>
            <addresses>
                <xsl:apply-templates />
            </addresses>
        </root>
    </xsl:template>
    <xsl:template match="/root/*/*" />
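    <!-- Explicit priority: this pattern and the empty "/root/*/*" rule above
         both get default priority 0.5, so state which one should win rather
         than rely on the processor's conflict recovery. -->
    <xsl:template priority="1"
        match="/root/*/*[generate-id()=generate-id(key('byType', @type)[1])]">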
        <xsl:apply-templates select="key('phoneByType', @type)"
            mode="wrap" />
        <xsl:apply-templates
            select="key('emailByType', @type)
                [position() > count(key('phoneByType', @type))]"
            mode="wrap" />
    </xsl:template>
    <xsl:template match="phone_numbers/phone_number" mode="wrap">
        <xsl:variable name="pos" select="position()" />
        <address name="{concat(@type, $pos)}">
            <xsl:apply-templates select="." mode="out" />
            <xsl:apply-templates select="key('emailByType', @type)[$pos]"
                mode="out" />
        </address>
    </xsl:template>
    <xsl:template match="email_addresses/email_address" mode="wrap">
        <address
            name="{concat(@type, 
                          position() + count(key('phoneByType', @type)))}">
            <xsl:apply-templates select="." mode="out" />
        </address>
    </xsl:template>
    <xsl:template match="/root/*/*" mode="out">
        <xsl:copy>
            <xsl:apply-templates />
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

On this input:

<root>
    <phone_numbers>
        <phone_number type="work">123-WORK</phone_number>
        <phone_number type="home">456-HOME</phone_number>
        <phone_number type="work">789-WORK</phone_number>
        <phone_number type="other">012-OTHER</phone_number>
    </phone_numbers>
    <email_addresses>
        <email_address type="home">a@home</email_address>
        <email_address type="other">b@other</email_address>
        <email_address type="home">c@home</email_address>
        <email_address type="work">d@work</email_address>
        <email_address type="other">e@other</email_address>
        <email_address type="other">f@other</email_address>
        <email_address type="test">g@other</email_address>
    </email_addresses>
</root>

Produces:

<root>
    <addresses>
        <address name="work1">
            <phone_number>123-WORK</phone_number>
            <email_address>d@work</email_address>
        </address>
        <address name="work2">
            <phone_number>789-WORK</phone_number>
        </address>
        <address name="home1">
            <phone_number>456-HOME</phone_number>
            <email_address>a@home</email_address>
        </address>
        <address name="home2">
            <email_address>c@home</email_address>
        </address>
        <address name="other1">
            <phone_number>012-OTHER</phone_number>
            <email_address>b@other</email_address>
        </address>
        <address name="other2">
            <email_address>e@other</email_address>
        </address>
        <address name="other3">
            <email_address>f@other</email_address>
        </address>
        <address name="test1">
            <email_address>g@other</email_address>
        </address>
    </addresses>
</root>

Explanation:

  • Three keys define three groups: 1) all contact info by type; 2) all phone numbers by type; 3) all email addresses by type
  • The first group is used to find the first occurrence of each type
  • Then we go through each of the phone numbers of that type, pairing each one with the email address at the same position, if there is one
  • Finally, we account for all of the email addresses that did not have a corresponding phone number, numbering them after the phone numbers
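The two "wrap" passes and their numbering can be traced outside XSLT. The following Python sketch mirrors them for a single type, assuming the phone numbers and email addresses of that type have already been collected into lists; `wrap_type` is a hypothetical helper name, not something from the stylesheet.

```python
def wrap_type(type_name, phones, emails):
    """Mimic the two 'wrap' passes for one type: phones first, then leftover emails."""
    addresses = []

    # Pass 1: one address per phone number, joined with the email address
    # at the same position when such an email exists.
    for pos, phone in enumerate(phones, start=1):
        email = emails[pos - 1] if pos <= len(emails) else None
        addresses.append((f"{type_name}{pos}", phone, email))

    # Pass 2: email addresses beyond the phone count get an address of their
    # own, numbered as position() + count(phones) like the email "wrap" rule.
    for offset, email in enumerate(emails[len(phones):], start=1):
        addresses.append((f"{type_name}{offset + len(phones)}", None, email))

    return addresses
```

For example, `wrap_type("other", ["012-OTHER"], ["b@other", "e@other", "f@other"])` yields the names `other1`, `other2`, `other3`, with only `other1` carrying a phone number. A type with no phone numbers at all, such as `test` in the input above, is handled entirely by the second pass.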

This transformation is considerably simpler (only three templates and no modes):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kTypeByVal" match="@type" use="."/>

 <xsl:key name="kPhNumByType" match="phone_number"
  use="@type"/>

 <xsl:key name="kAddrByType" match="email_address"
  use="@type"/>

 <xsl:variable name="vallTypes" select=
 "/*/*/*/@type
          [generate-id()
          =
           generate-id(key('kTypeByVal',.)[1])
          ]"/>

 <xsl:template match="/">
  <root>
   <addresses>
    <xsl:apply-templates select="$vallTypes"/>
   </addresses>
  </root>
 </xsl:template>

 <xsl:template match="@type">
  <xsl:variable name="vcurType" select="."/>
  <xsl:variable name="vPhoneNums" select="key('kPhNumByType',.)"/>
  <xsl:variable name="vAddresses" select="key('kAddrByType',.)"/>

  <xsl:variable name="vLonger" select=
  "$vPhoneNums[count($vPhoneNums) > count($vAddresses)]
  |
   $vAddresses[not(count($vPhoneNums) > count($vAddresses))]
  "/>

  <xsl:for-each select="$vLonger">
   <xsl:variable name="vPos" select="position()"/>
   <address name="{$vcurType}{$vPos}">
    <xsl:apply-templates select="$vPhoneNums[position()=$vPos]"/>
    <xsl:apply-templates select="$vAddresses[position()=$vPos]"/>
   </address>
  </xsl:for-each>
 </xsl:template>

 <xsl:template match="phone_number|email_address">
  <xsl:copy>
   <xsl:copy-of select="node()"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When applied to the provided XML document (and any document with the described properties):

<root>
    <phone_numbers>
        <phone_number type="work">123-WORK</phone_number>
        <phone_number type="home">456-HOME</phone_number>
        <phone_number type="work">789-WORK</phone_number>
        <phone_number type="other">012-OTHER</phone_number>
    </phone_numbers>
    <email_addresses>
        <email_address type="home">a@home</email_address>
        <email_address type="other">b@other</email_address>
        <email_address type="home">c@home</email_address>
        <email_address type="work">d@work</email_address>
        <email_address type="other">e@other</email_address>
        <email_address type="other">f@other</email_address>
    </email_addresses>
</root>

the wanted, correct result is produced:

<root>
   <addresses>
      <address name="work1">
         <phone_number>123-WORK</phone_number>
         <email_address>d@work</email_address>
      </address>
      <address name="work2">
         <phone_number>789-WORK</phone_number>
      </address>
      <address name="home1">
         <phone_number>456-HOME</phone_number>
         <email_address>a@home</email_address>
      </address>
      <address name="home2">
         <email_address>c@home</email_address>
      </address>
      <address name="other1">
         <phone_number>012-OTHER</phone_number>
         <email_address>b@other</email_address>
      </address>
      <address name="other2">
         <email_address>e@other</email_address>
      </address>
      <address name="other3">
         <email_address>f@other</email_address>
      </address>
   </addresses>
</root>

Explanation:

  1. All distinct values of the type attribute are collected in the $vallTypes variable, using the Muenchian method for grouping.

  2. For every distinct value found in step 1, one or more <address> elements are output as follows.

  3. A name attribute is generated whose value is the concatenation of the current type and the current position().

  4. Two node-sets are captured in variables: one containing all phone_number elements that have this specific value of their type attribute, and another containing all email_address elements that have this specific value of their type attribute.

  5. For every position in the longer of these two node-sets, the element at that position in each node-set (a pair where both exist, otherwise a single element) is copied to the final output, omitting the type attribute.
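Step 1 is the least obvious part, so here is the idea behind Muenchian grouping restated in Python. It is only an analogy under stated assumptions: a dictionary stands in for `xsl:key`, an identity comparison stands in for the `generate-id()` test, and `distinct_types` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def distinct_types(xml_text):
    """Return each type value once, in order of first appearance."""
    root = ET.fromstring(xml_text)
    nodes = root.findall("./*/*")  # grandchildren of the root, like /*/*/*

    # Analogue of xsl:key: index every node by the value of its type attribute.
    index = {}
    for node in nodes:
        index.setdefault(node.get("type"), []).append(node)

    # Analogue of generate-id() = generate-id(key(...)[1]): keep a node only
    # if it is the first member of the group its own key value selects.
    return [n.get("type") for n in nodes if n is index[n.get("type")][0]]
```

Because the filter runs over the nodes in document order, the surviving values come out in order of first appearance, which is why the result above lists work, home, other.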
