How to parse text nodes in XML using XSLT 2.0 functions

Let's say I have this temporary document:

<xsl:variable name="var">
    <root>
        <sentence>Hello world</sentence>
        <sentence>Foo foo</sentence>
    </root>
</xsl:variable>

I am using xsl:analyze-string to find the string "world" and wrap it in an element called <match>:

<xsl:function name="my:parse">
    <xsl:param name="input"/>
    <xsl:analyze-string select="$input" regex="world">
        <xsl:matching-substring>
            <match><xsl:copy-of select="."/></match>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <xsl:copy-of select="."/>
        </xsl:non-matching-substring>
    </xsl:analyze-string>
</xsl:function>

This function will return:

Hello <match>world</match>Foo foo

Naturally I want this output:

    <root>
        <sentence>Hello <match>world</match></sentence>
        <sentence>Foo foo</sentence>
    </root>

I understand why my function does this, but I can't figure out how to copy the elements and inject the new content into them. I know there is an issue with the context item, but I have tried many other approaches and none of them works for me.

On the other hand, using a template that matches //text() works fine. But the requirement is to use a function (because I am working on a multiphase transform and I want functions that represent each step). Is there a solution to this problem? Am I missing something fundamental?

If you only need to match on one text node at a time, then the way to do this is to do a recursive descent of the tree using template rules, with an identity template for elements, and a template that does your analyze-string for text nodes. Use a mode to separate this from other processing logic, and call apply-templates from your function specifying this mode, so the use of templates is entirely hidden within the implementation of the function.

Here's an illustration of my interpretation of Michael Kay's (+1) suggestion...

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="my" exclude-result-prefixes="my">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()" mode="#all" priority="-1">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" mode="#current"/>
        </xsl:copy>
    </xsl:template>

    <xsl:variable name="var">
        <root>
            <sentence>Hello world</sentence>
            <sentence>Foo foo</sentence>
        </root>
    </xsl:variable>

    <xsl:function name="my:parse">
        <xsl:param name="input"/>
        <xsl:apply-templates select="$input" mode="markup-step"/>
    </xsl:function>

    <xsl:template match="/*">
        <xsl:copy-of select="my:parse($var)"/>
    </xsl:template>

    <xsl:template match="text()" mode="markup-step">
        <xsl:analyze-string select="." regex="world">
            <xsl:matching-substring>
                <match><xsl:copy-of select="."/></match>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:copy-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>

Output (using any well-formed XML input)

<root>
   <sentence>Hello <match>world</match>
   </sentence>
   <sentence>Foo foo</sentence>
</root>
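
Since the question mentions a multiphase transform, here is a minimal sketch of how two such function-wrapped steps might be chained. The function names my:step1 and my:step2, the modes step1 and step2, and the <hit> element are all made up for illustration and do not come from the posts above. The identity rule is given priority="-1" so that it does not conflict with the text() rule, which has the same default priority as node().

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="my" exclude-result-prefixes="my">

    <xsl:variable name="var">
        <root>
            <sentence>Hello world</sentence>
            <sentence>Foo foo</sentence>
        </root>
    </xsl:variable>

    <!-- Identity rule shared by both (hypothetical) step modes -->
    <xsl:template match="@*|node()" mode="step1 step2" priority="-1">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" mode="#current"/>
        </xsl:copy>
    </xsl:template>

    <!-- Step 1: wrap every occurrence of "world" in <match> -->
    <xsl:function name="my:step1" as="node()*">
        <xsl:param name="input" as="node()*"/>
        <xsl:apply-templates select="$input" mode="step1"/>
    </xsl:function>

    <xsl:template match="text()" mode="step1">
        <xsl:analyze-string select="." regex="world">
            <xsl:matching-substring>
                <match><xsl:value-of select="."/></match>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

    <!-- Step 2 (illustrative only): rename <match> to <hit> -->
    <xsl:function name="my:step2" as="node()*">
        <xsl:param name="input" as="node()*"/>
        <xsl:apply-templates select="$input" mode="step2"/>
    </xsl:function>

    <xsl:template match="match" mode="step2">
        <hit><xsl:apply-templates select="@*|node()" mode="#current"/></hit>
    </xsl:template>

    <!-- Chain the steps: the output of step 1 is the input of step 2 -->
    <xsl:template match="/">
        <xsl:copy-of select="my:step2(my:step1($var))"/>
    </xsl:template>

</xsl:stylesheet>

Each function takes nodes and returns nodes, so the steps compose as ordinary function calls, and the templates stay hidden behind one mode per step. In this sketch the second step should see the <match> element created by the first step and emit <hit>world</hit> in its place.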


 