Double iteration through chars in a string and nodes in XSLT - how to do it with recursion?

I need to replace some characters in a string stored in <p> with <app> nodes which contain a matching char (or substring) in a child element <lem>. Each <app> contains only one <lem> at the top, and an arbitrary number of other nodes below it. Each <app> refers to only a single character in the text, and they are placed in order.

I am new to XSLT and cannot come up with a good recursion to do this. I'm kind of stuck in the Java or MATLAB mindset of iterating over i = 1:n and j = 1:m, and I understand that this is no good for taking advantage of recursion in XSLT... Thanks for your help!

<div>
            <p>SOMEWONDERFULOLDTEXT</p>
            <app>
               <lem>O</lem>
               <rdg>Ø</rdg>
            </app>
            <app>
               <lem>W</lem>
               <rdg>V</rdg>
            </app>
            <app>
               <lem>O</lem>
               <rdg>Ö</rdg>
            </app>
            <app>
               <lem>E</lem>
               <rdg>Ë</rdg>
               <rdg>ę</rdg>
            </app>
         </div>

My stylesheet so far is this, but I know it doesn't work because it iterates through the whole text once for every <app>, which is wrong.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs tei" version="3.0">

    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>

    <!-- now build the apparatus -->
    <xsl:template match="tei:div">
        <xsl:param name="thisBlock" select="./tei:p/node()"/>
        <xsl:for-each select="tei:app">
            <xsl:variable name="thisApp" select="."/>
            <xsl:for-each
                select="tokenize(replace(replace($thisBlock, '(.)', '$1\\n'), '\\n$', ''), '\\n')">
                <xsl:choose>
                    <xsl:when test="$thisApp/tei:lem/text() = .">
                    <xsl:copy-of select="$thisApp"></xsl:copy-of>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates></xsl:apply-templates>
                </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

What I am getting instead is a frightful mess: each <app> containing the variant readings of O is printed for every single O in the text, regardless of order (of course, because I don't know how to iterate linearly along two "arrays"). The result I want is the following:

<div>
            <p>S<app>
               <lem>O</lem>
               <rdg>Ø</rdg>
            </app>ME<app>
               <lem>W</lem>
               <rdg>V</rdg>
            </app><app>
               <lem>O</lem>
               <rdg>Ö</rdg>
            </app>ND<app>
               <lem>E</lem>
               <rdg>Ë</rdg>
               <rdg>ę</rdg>
            </app>RFULOLDTEXT</p>
         </div>

I suppose this is one way you could look at it:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="div">
    <div>
        <p>
            <xsl:call-template name="process">
                <xsl:with-param name="string" select="p"/>
                <xsl:with-param name="app" select="app"/>
            </xsl:call-template>
        </p>
    </div>  
</xsl:template>

<xsl:template name="process">
    <xsl:param name="string"/>
    <xsl:param name="app"/>
    <xsl:choose>
        <xsl:when test="$app">
            <xsl:variable name="char" select="$app[1]/lem" />
            <xsl:value-of select="substring-before($string, $char)" />
            <xsl:copy-of select="$app[1]"/>
            <!-- recursive call -->
            <xsl:call-template name="process">
                <xsl:with-param name="string" select="substring-after($string, $char)"/>
                <xsl:with-param name="app" select="$app[position()>1]"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$string" />
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

This could probably be smartened up a bit in XSLT 3.0, but I prefer the clarity of the named template method.
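For comparison, here is a minimal sketch of how the same recursion might look as an XSLT 3.0 stylesheet function. It is not part of the answer above. It assumes the input elements are in the TEI namespace (as the asker's stylesheet suggests) and that each div has exactly one p. The function name f:weave and its namespace URI are made up for illustration. Unlike the named template, it skips an app whose lem is empty or not found in the remaining text, rather than dropping the rest of the string.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local-functions"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all">

  <!-- copy everything that has no more specific template -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- f:weave (hypothetical helper): consume $text from left to right,
       replacing the first occurrence of each app's lem, in app order -->
  <xsl:function name="f:weave" as="node()*">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="apps" as="element()*"/>
    <xsl:variable name="lem" select="string(head($apps)/lem)"/>
    <xsl:choose>
      <!-- no entries left: emit whatever text remains -->
      <xsl:when test="empty($apps)">
        <xsl:value-of select="$text"/>
      </xsl:when>
      <!-- lem empty or not present: skip this entry, keep the text -->
      <xsl:when test="$lem = '' or not(contains($text, $lem))">
        <xsl:sequence select="f:weave($text, tail($apps))"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- text before the match, then the app, then recurse on the rest -->
        <xsl:value-of select="substring-before($text, $lem)"/>
        <xsl:copy-of select="head($apps)"/>
        <xsl:sequence
            select="f:weave(substring-after($text, $lem), tail($apps))"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- rebuild p from its string value and the sibling app elements -->
  <xsl:template match="div[p and app]/p">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:sequence select="f:weave(string(.), ../app)"/>
    </xsl:copy>
  </xsl:template>

  <!-- the app elements have been moved into p, so drop the originals -->
  <xsl:template match="div[p]/app"/>

</xsl:stylesheet>
```

The text pieces are emitted with xsl:value-of rather than returned as strings, so that they arrive as text nodes and no separator spaces are inserted between adjacent pieces when the content of p is constructed.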

The use of XSLT 3 and the request for a "recursive" approach while wanting to "iterate" makes me wonder whether xsl:iterate can help:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="div[p and app]">
    <xsl:copy>
      <p>
        <xsl:iterate select="app">
          <xsl:param name="text" select="p"/>
          <xsl:on-completion select="$text"/>
          <xsl:sequence select="substring-before($text, lem), 
                                .[contains($text, lem)]"/>
          <xsl:next-iteration>
            <xsl:with-param name="text" select="substring-after($text, lem)"/>
          </xsl:next-iteration>
        </xsl:iterate>
      </p>
    </xsl:copy>
  </xsl:template>
  
</xsl:stylesheet>
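One edge case is worth keeping in mind with the substring-before/substring-after pattern: both functions return an empty string when the search string does not occur, so an app whose lem is missing from the remaining text would cause the rest of the text to be lost. The question states that every app matches, so this may never arise. If it could, a guarded variant might look like the following sketch. It rests on the same assumptions as the stylesheet above (TEI namespace, one p per div), and it relies on xsl:iterate keeping its parameter values when xsl:next-iteration is not evaluated.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="div[p and app]">
    <xsl:copy>
      <p>
        <xsl:iterate select="app">
          <!-- text still to be processed; starts as the full content of p -->
          <xsl:param name="text" as="xs:string" select="string(p)"/>
          <!-- after the last app, emit whatever text is left -->
          <xsl:on-completion>
            <xsl:value-of select="$text"/>
          </xsl:on-completion>
          <!-- only consume text when this app's lem actually occurs in it;
               otherwise the app is skipped and $text is left unchanged -->
          <xsl:if test="string(lem) ne '' and contains($text, lem)">
            <xsl:value-of select="substring-before($text, lem)"/>
            <xsl:copy-of select="."/>
            <xsl:next-iteration>
              <xsl:with-param name="text"
                              select="substring-after($text, lem)"/>
            </xsl:next-iteration>
          </xsl:if>
        </xsl:iterate>
      </p>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Whether a non-matching app should be silently skipped, as here, or reported as an error is a design choice that depends on how much the input can be trusted.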

I don't think you need recursion, if I've understood what you're trying to do. Here's how I might attack the problem:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs tei" version="3.0"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <xsl:template match="p/text()">
    <!-- find the apps that apply to the current text node -->
    <xsl:variable name="apps" select="ancestor::div/app"/>
    
    <!-- parse the text node into a sequence of 1-char strings
      This weird trick uses string-to-codepoints() to tokenize 
      the string into a sequence of character codepoints, and
      then uses codepoints-to-string() to turn each integer 
      codepoint back into a 1-char string, yielding a sequence
      of characters 
    -->
    <xsl:variable name="characters" select="
      for $codepoint in 
        string-to-codepoints(.) 
      return 
        codepoints-to-string($codepoint)
    "/>

    <!-- for each character, output a matching app if
      there is one, or otherwise the character itself
    -->
    <xsl:for-each select="$characters">
      <xsl:variable name="character" select="."/>
      <xsl:variable name="app" select="$apps[lem = $character]"/>
      <xsl:choose>
        <xsl:when test="$app">
          <xsl:copy-of select="$app"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>

  <!-- discard app elements -->
  <xsl:template match="app"/>

  <!-- copy everything else -->
  <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>

Applied to:

<div xmlns="http://www.tei-c.org/ns/1.0">
  <p>SOMEWONDERFULOLDTEXT</p>
  <app>
     <lem>O</lem>
     <rdg>Ø</rdg>
  </app>
  <app>
     <lem>W</lem>
     <rdg>V</rdg>
  </app>
  <app>
     <lem>O</lem>
     <rdg>Ö</rdg>
  </app>
  <app>
     <lem>E</lem>
     <rdg>Ë</rdg>
     <rdg>ę</rdg>
  </app>
</div>

Produces result:

<div xmlns="http://www.tei-c.org/ns/1.0">
  <p>S<app>
     <lem>O</lem>
     <rdg>Ø</rdg>
  </app><app>
     <lem>O</lem>
     <rdg>Ö</rdg>
  </app>M<app>
     <lem>E</lem>
     <rdg>Ë</rdg>
     <rdg>ę</rdg>
  </app><app>
     <lem>W</lem>
     <rdg>V</rdg>
  </app><app>
     <lem>O</lem>
     <rdg>Ø</rdg>
  </app><app>
     <lem>O</lem>
     <rdg>Ö</rdg>
  </app>ND<app>
     <lem>E</lem>
     <rdg>Ë</rdg>
     <rdg>ę</rdg>
  </app>RFUL<app>
     <lem>O</lem>
     <rdg>Ø</rdg>
  </app><app>
     <lem>O</lem>
     <rdg>Ö</rdg>
  </app>LDT<app>
     <lem>E</lem>
     <rdg>Ë</rdg>
     <rdg>ę</rdg>
  </app>XT</p>
  
  
  
  
</div>
