
XPath: select escaped nodes within a node

I have this XML:

<text>
    blah blah &lt;strong&gt; hello &lt;/strong&gt; more text &lt;strong&gt;hello again&lt;/strong&gt; blah blah
</text>

How do I select the text within the strong tags, which have been escaped with &lt; and &gt;?

In this example the selection should be:

  1. hello
  2. hello again

Update: the solution needs to be XSLT 1.0.

Since you have updated the question to say you can only use XSLT 1.0, see this post: How to use XSLT 1.0 or XPath to manipulate an HTML string

This is a little complex, but to replace <, >, and & you'll have to clean the string three times, once for each escaped character.

Here is some XSLT to get you started:

<!-- Three nested SubstringReplace calls, evaluated innermost first:
     unescape &lt;, then &gt;, and finally &amp;. $theXml holds the escaped string. -->
<xsl:variable name="cleanXML">
  <xsl:call-template name="SubstringReplace">
    <xsl:with-param name="stringIn">
      <xsl:call-template name="SubstringReplace">
        <xsl:with-param name="stringIn">
          <xsl:call-template name="SubstringReplace">
            <xsl:with-param name="stringIn" select="$theXml"/>
            <xsl:with-param name="substringIn" select="'&amp;lt;'"/>
            <xsl:with-param name="substringOut" select="'&lt;'"/>
          </xsl:call-template>
        </xsl:with-param>
        <xsl:with-param name="substringIn" select="'&amp;gt;'"/>
        <xsl:with-param name="substringOut" select="'&gt;'"/>
      </xsl:call-template>
    </xsl:with-param>
    <xsl:with-param name="substringIn" select="'&amp;amp;'"/>
    <xsl:with-param name="substringOut" select="'&amp;'"/>
  </xsl:call-template>
</xsl:variable>
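
The snippet above calls a named template, SubstringReplace, that is not defined in this answer. As a rough sketch, assuming only the three parameter names used in the calls (stringIn, substringIn, substringOut), a recursive XSLT 1.0 replace template might look like the following. The body is an assumption for illustration, not necessarily the code from the linked post.

```xml
<!-- Hypothetical definition of the SubstringReplace template called above.
     Replaces every occurrence of $substringIn in $stringIn with $substringOut. -->
<xsl:template name="SubstringReplace">
  <xsl:param name="stringIn"/>
  <xsl:param name="substringIn"/>
  <xsl:param name="substringOut"/>
  <xsl:choose>
    <!-- The empty-string guard prevents endless recursion,
         because contains($s, '') is always true in XPath 1.0. -->
    <xsl:when test="$substringIn != '' and contains($stringIn, $substringIn)">
      <!-- Emit the text before the match, then the replacement. -->
      <xsl:value-of select="substring-before($stringIn, $substringIn)"/>
      <xsl:value-of select="$substringOut"/>
      <!-- Recurse on whatever follows the match. -->
      <xsl:call-template name="SubstringReplace">
        <xsl:with-param name="stringIn" select="substring-after($stringIn, $substringIn)"/>
        <xsl:with-param name="substringIn" select="$substringIn"/>
        <xsl:with-param name="substringOut" select="$substringOut"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- No further match: emit the remainder unchanged. -->
      <xsl:value-of select="$stringIn"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

The template has to sit at the top level of the stylesheet, as a sibling of the other templates, so that xsl:call-template can find it by name.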

Here is a C# implementation.

Namespaces used:

using System.Collections.Generic;
using System.Web;
using System.Xml;

Implementation:

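     // Sample XML; the <strong> tags inside <text> are escaped as entities.
     string xmlText = "<text>blah blah &lt;strong&gt; hello &lt;/strong&gt; more text &lt;strong&gt;hello again&lt;/strong&gt; blah blah</text>";

     // Decode the entities so the escaped tags become real elements, then parse.
     // This only works if the decoded markup is itself well-formed XML.
     XmlDocument doc = new XmlDocument();
     doc.LoadXml(HttpUtility.HtmlDecode(xmlText));
     XmlNodeList nodes = doc.GetElementsByTagName("strong");

     // Collect the trimmed text of each <strong> element.
     List<string> nodeValues = new List<string>();
     foreach (XmlNode node in nodes)
     {
         nodeValues.Add(node.InnerText.Trim());
     }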
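
Since the question asks for XSLT 1.0, the same extraction can also be sketched without leaving the stylesheet. An XML parser resolves &lt; and &gt; while reading the document, so the string value of the text element already contains the tags as plain characters, and the XPath 1.0 functions contains, substring-before and substring-after can work on it directly. The template name ExtractStrong and its parameter s are hypothetical names chosen for this illustration, and the sketch assumes the root element is text, as in the sample.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Entry point: scan the string value of the root text element. -->
  <xsl:template match="/text">
    <xsl:call-template name="ExtractStrong">
      <xsl:with-param name="s" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: writes the text found between each escaped
       strong start tag and end tag, one result per line. -->
  <xsl:template name="ExtractStrong">
    <xsl:param name="s"/>
    <xsl:variable name="open" select="'&lt;strong&gt;'"/>
    <xsl:variable name="close" select="'&lt;/strong&gt;'"/>
    <!-- Continue only while a complete start/end pair remains. -->
    <xsl:if test="contains($s, $open) and contains(substring-after($s, $open), $close)">
      <xsl:variable name="rest" select="substring-after($s, $open)"/>
      <!-- Trim surrounding whitespace from the extracted text. -->
      <xsl:value-of select="normalize-space(substring-before($rest, $close))"/>
      <xsl:text>&#10;</xsl:text>
      <!-- Recurse on whatever follows the end tag. -->
      <xsl:call-template name="ExtractStrong">
        <xsl:with-param name="s" select="substring-after($rest, $close)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample document from the question, this writes hello and hello again on separate lines. It does plain string matching, so it does not handle nested strong tags or start tags that carry attributes.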
