
How to detect line breaks in XSLT

I need to be able to output the text of an XML document separated by line breaks. In other words, the XML:

<programlisting>
public static void main(String[] args){
    System.out.println("Be happy!");  
    System.out.println("And now we add annotations.");  
}
</programlisting>

needs to be represented as:

<para>public static void main(String[] args){</para>
<para>    System.out.println("Be happy!"); </para>
<para>    System.out.println("And now we add annotations.");  </para>
<para>}</para>

I thought that I should be able to use substring-before(., '\n'), but for some reason it doesn't recognize the line break.

I also tried to output each line as a CDATA section so that I could pull those separately, but ran into the fact that they're all smushed together into a single text node.

I'm just using regular Java here for transformation. Any ideas on how to accomplish this?

Thanks...

As was explained in this answer, XML parsers normalize every line break to a single line feed character, which can be written in a stylesheet as the character reference &#10;. To split a string at a line break, you therefore have to split at this character.
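The reason '\n' finds nothing is that XPath 1.0 string literals have no escape sequences, so the processor sees a backslash followed by the letter n. A minimal sketch that checks this from Java with the JDK's javax.xml.xpath API follows; the class name NewlineLiteralDemo and the helper eval are made up for this illustration, and the sample XML is deliberately shortened.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class NewlineLiteralDemo {

    // Hypothetical helper: evaluates an XPath 1.0 expression against an XML string
    // and returns the result converted to a string.
    static String eval(String expr, String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The source uses a Windows-style CRLF; the XML parser normalizes it to LF.
        String xml = "<programlisting>first line\r\nsecond line</programlisting>";

        // Java "\\n" hands XPath the two characters backslash and n.
        // XPath does not interpret them as an escape, so nothing is found.
        String withBackslashN = eval("substring-before(/programlisting, '\\n')", xml);

        // Java "\n" places a real line feed inside the XPath string literal.
        String withLineFeed = eval("substring-before(/programlisting, '\n')", xml);

        System.out.println("backslash-n literal: [" + withBackslashN + "]");
        System.out.println("real line feed:      [" + withLineFeed + "]");
    }
}
```

Inside a stylesheet the real line feed is written as &#10; (or &#xA;), because the XML parser resolves the character reference before the XPath expression is compiled.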

Therefore, a solution in plain XSLT 1.0 (without extensions) can look like:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output indent="yes"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="programlisting/text()">
    <xsl:param name="text" select="."/>
    <para>
      <!-- Because we would rely on $text containing a line break when using 
           substring-before($text,'&#10;') and the last line might not have a
           trailing line break, we append one before doing substring-before().  -->
      <xsl:value-of select="substring-before(concat($text,'&#10;'),'&#10;')"/>
    </para>
    <xsl:if test="contains($text,'&#10;')">
      <xsl:apply-templates select=".">
        <xsl:with-param name="text" select="substring-after($text,'&#10;')"/>
      </xsl:apply-templates>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

With your given XML source, this outputs an empty <para> element for the line break right after the opening tag and another for the one right before the closing tag. One could also filter out empty lines (like Dimitre does), but that also removes empty lines in the middle of the code listing. If empty lines at the start and end must be removed while empty lines in the middle are kept, a more elaborate approach is required.

This is just demonstrating that the task is not difficult at all using plain XSLT 1.0.
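Since the question mentions plain Java, here is a minimal sketch of running this stylesheet through the JDK's built-in javax.xml.transform API, which handles XSLT 1.0. The class name LineSplitDemo and the method transform are hypothetical. The embedded stylesheet is the one above with the xsl:output declaration and the comment left out to keep the listing short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class LineSplitDemo {

    // The XSLT 1.0 stylesheet from the answer (xsl:output and the comment omitted).
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">",
        "  <xsl:template match=\"@*|node()\">",
        "    <xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>",
        "  </xsl:template>",
        "  <xsl:template match=\"programlisting/text()\">",
        "    <xsl:param name=\"text\" select=\".\"/>",
        "    <para><xsl:value-of",
        "        select=\"substring-before(concat($text,'&#10;'),'&#10;')\"/></para>",
        "    <xsl:if test=\"contains($text,'&#10;')\">",
        "      <xsl:apply-templates select=\".\">",
        "        <xsl:with-param name=\"text\" select=\"substring-after($text,'&#10;')\"/>",
        "      </xsl:apply-templates>",
        "    </xsl:if>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Hypothetical helper: applies the stylesheet to an XML string and
    // returns the serialized result.
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<programlisting>\n"
            + "public static void main(String[] args){\n"
            + "    System.out.println(\"Be happy!\");\n"
            + "}\n"
            + "</programlisting>";
        // Each source line ends up in its own <para>; the leading and trailing
        // line breaks yield the empty <para> elements mentioned above.
        System.out.println(transform(xml));
    }
}
```

In a real project the stylesheet would more likely live in its own file and be loaded with new StreamSource(new File(...)); the string constant is used here only to keep the sketch self-contained.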

I. XSLT 2.0 solution:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="text()">
  <xsl:for-each select="tokenize(., '\r?\n')[.]">
   <para><xsl:sequence select="."/></para>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<programlisting>
public static void main(String[] args){
    System.out.println("Be happy!");
    System.out.println("And now we add annotations.");
}
</programlisting>

the wanted, correct result is produced:

<programlisting>
   <para>public static void main(String[] args){</para>
   <para>    System.out.println("Be happy!");</para>
   <para>    System.out.println("And now we add annotations.");</para>
   <para>}</para>
</programlisting>
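The JDK's built-in XSLT processor only supports XSLT 1.0, so running the stylesheet above requires an XSLT 2.0 processor on the classpath. To illustrate what the tokenize(...)[.] expression does, here is a rough plain-Java analogue: split on line breaks, then drop zero-length tokens, which is what the [.] predicate amounts to for strings. The class name TokenizeDemo, the method nonEmptyLines, and the regex (a line feed optionally preceded by a carriage return) are choices made for this sketch, not part of the answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TokenizeDemo {

    // Hypothetical helper mirroring tokenize(., regex)[.]:
    // split at line breaks and keep only tokens that are not zero-length.
    static List<String> nonEmptyLines(String text) {
        return Arrays.stream(text.split("\\r?\\n"))
            .filter(line -> !line.isEmpty())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "\n"
            + "public static void main(String[] args){\n"
            + "    System.out.println(\"Be happy!\");\n"
            + "    System.out.println(\"And now we add annotations.\");\n"
            + "}\n";
        // Wrap every remaining line in <para>, as the xsl:for-each does.
        for (String line : nonEmptyLines(text)) {
            System.out.println("<para>" + line + "</para>");
        }
    }
}
```

As with the stylesheet, this drops empty lines in the middle of the listing too, while lines that contain only spaces are kept.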

II. XSLT 1.0 solution, using the str-split-to-words template of FXSL:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
  <xsl:import href="strSplit-to-Words.xsl"/>
  <xsl:output indent="yes" omit-xml-declaration="yes"/>

   <xsl:strip-space elements="*"/>

   <xsl:param name="pDelims" select="'&#xA;&#xD;'"/>

    <xsl:template match="/">
      <xsl:variable name="vwordNodes">
        <xsl:call-template name="str-split-to-words">
          <xsl:with-param name="pStr" select="/"/>
          <xsl:with-param name="pDelimiters"
                          select="$pDelims"/>
        </xsl:call-template>
      </xsl:variable>

      <xsl:apply-templates select=
      "ext:node-set($vwordNodes)/*[normalize-space()]"/>
    </xsl:template>

    <xsl:template match="word">
      <para><xsl:value-of select="."/></para>
    </xsl:template>
</xsl:stylesheet>

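When this transformation is applied to the same XML document (above), the same <para> elements are produced, this time without the enclosing <programlisting> element, because the stylesheet outputs only the split lines: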

<para>public static void main(String[] args){</para>
<para>    System.out.println("Be happy!");</para>
<para>    System.out.println("And now we add annotations.");</para>
<para>}</para>

