
Remove Namespace and Extract a subset of XML file using XSL

My input XML is:

 <country>
       <state>
           <city>
               <name>DELHI</name>            
           </city>
      </state>
    </country>

The required output is:

<city>
  <name>DELHI</name>            
</city>

The following XSL works fine:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" omit-xml-declaration="yes" />
    <xsl:template match="/">
        <xsl:copy-of select="//city">
        </xsl:copy-of>
    </xsl:template>
</xsl:stylesheet>

But the same XSL does not work for the above input XML once a namespace is added, as below:

<country xmlns="http://india.com/states" version="1.0">
   <state>
       <city>
           <name>DELHI</name>            
       </city>
  </state>
</country>

I want the city element to be copied with the namespace removed.

Any help would be appreciated. Thanks

This is the most frequently asked question about XPath, XML and XSLT. Search for "default namespace and XPath expressions".

As for a solution :

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:x="http://india.com/states">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="*">
  <xsl:element name="{name()}">
   <xsl:copy-of select="@*"/>
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>


 <xsl:template match="*[not(ancestor-or-self::x:city)]">
  <xsl:apply-templates/>
 </xsl:template>
</xsl:stylesheet>

when this transformation is applied on the provided XML document :

<country xmlns="http://india.com/states" version="1.0">
    <state>
        <city>
            <name>DELHI</name>
        </city>
    </state>
</country>

the wanted result is produced :

<city>
   <name>DELHI</name>
</city>

Explanation :

  1. In XPath an unprefixed element name is always considered to be in "no namespace". However, every element name in the provided XML document is in a non-empty namespace (the default namespace "http://india.com/states"). Therefore, //city selects no node (as there is no element in the XML document that is in no namespace), while //x:city, where x: is bound to the namespace "http://india.com/states", selects all city elements (that are in the namespace "http://india.com/states").

  2. In this transformation there are two templates. The first template matches any element and re-creates it, but in no namespace. It also copies all attributes and then applies templates to the child nodes of this element.

  3. The second template overrides the first (its predicate gives it a higher default priority) for every element that is neither a city element nor a descendant of one. Such an element is not output; templates are simply applied to its child nodes.
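Point 1 can be checked with a small sketch. This stylesheet is not part of the original answer; it assumes the same x: prefix binding as above and only prints how many nodes each expression selects:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:x="http://india.com/states">
 <xsl:output method="text"/>

 <xsl:template match="/">
  <!-- Unprefixed name: looks for city elements in no namespace -->
  <xsl:value-of select="count(//city)"/>
  <xsl:text> </xsl:text>
  <!-- Prefixed name: looks for city elements in http://india.com/states -->
  <xsl:value-of select="count(//x:city)"/>
 </xsl:template>
</xsl:stylesheet>
```

Applied to the namespaced document above, this should print 0 followed by 1. The prefix itself is arbitrary; only the namespace URI it is bound to has to match the one in the source document.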

UPDATE: The OP has modified the question, asking why there is unwanted text in the result of processing a new, modified XML document:

<country xmlns="http://india.com/states" version="1.0">
        <state>
            <city>
                <name>DELHI</name>
            </city>
        </state>
        <state2>
            <city2>
                <name2>MUMBAI</name2>
            </city2>
        </state2>
</country>

In order not to produce the text "MUMBAI", the transformation above needs a small modification: it must ignore (not copy) any text node that has no x:city ancestor. For this purpose, we add the following one-line, empty template:

 <xsl:template match="text()[not(ancestor::x:city)]"/>

The whole transformation now becomes :

<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:x="http://india.com/states">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <xsl:template match="*">
      <xsl:element name="{name()}">
       <xsl:copy-of select="@*"/>
       <xsl:apply-templates/>
      </xsl:element>
     </xsl:template>

     <xsl:template match="*[not(ancestor-or-self::x:city)]">
      <xsl:apply-templates/>
     </xsl:template>

     <xsl:template match="text()[not(ancestor::x:city)]"/>
</xsl:stylesheet>

and the result is still the wanted, correct one :

<city>
   <name>DELHI</name>
</city>

You can get the wanted output by using a template like:

 <xsl:template match="*[not(ancestor-or-self::x:*[starts-with(name(),'city')])]">
  <xsl:apply-templates/>
 </xsl:template>

or

 <xsl:template match="/">
     <xsl:apply-templates select="//x:*[starts-with(name(),'city')]"/>
 </xsl:template>

Tested with Microsoft (R) XSLT Processor Version 4.0 on your new input, this gives:

<city>
   <name>DELHI</name>
</city>
<city2>
   <name2>MUMBAI</name2>
</city2>
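For reference, the second fragment can be placed in a complete stylesheet along the following lines. This is a sketch rather than the exact stylesheet that was tested: it assumes the x: prefix is bound to "http://india.com/states" as in the first answer, and it uses local-name() instead of name() so that it would also cope with source elements that carry a namespace prefix.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:x="http://india.com/states">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Start only from elements whose local name begins with "city" -->
 <xsl:template match="/">
  <xsl:apply-templates select="//x:*[starts-with(local-name(),'city')]"/>
 </xsl:template>

 <!-- Re-create every processed element in no namespace -->
 <xsl:template match="*">
  <xsl:element name="{local-name()}">
   <xsl:copy-of select="@*"/>
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>
</xsl:stylesheet>
```

Because processing starts at the selected elements rather than at the document root, nothing outside them is visited, so no extra template is needed to suppress stray text nodes.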
