XPath: Complex expression to include some nodes and exclude others

Example XML:

<foo>
    <bar name="bar1">
    </bar>
    <bar name="bar2">
    </bar>
</foo>
<qux>
    <foo>
        <bar name="bar3">
        </bar>
    </foo>
     <bar name="bar4">
     </bar>
</qux>

What is the expression to select all bar elements that are descendants of the root foo (bar1, bar2, bar4) but not of the nested foo (bar3)?

Thank you in advance!

Here is probably one of the simplest and shortest XPath expressions for this. Evaluated on any well-formed XML document whose top element is foo, with foo elements nested to any depth, it selects exactly the bar elements whose only foo ancestor is the top element:

//bar[not(ancestor::foo[2])]

This XPath expression selects any bar element in the document that has fewer than two foo ancestors. Because the top element is by definition a foo, every bar has that top-level foo as an ancestor. A bar inside a nested foo has at least a second foo ancestor and is not selected by the expression above, because in that case boolean(ancestor::foo[2]) is true().
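The "fewer than two foo ancestors" rule can also be sketched outside an XPath engine. The Python below is only an illustration of that rule, not an evaluation of the XPath itself: the standard library's ElementTree supports a limited XPath subset without the ancestor axis, so the traversal counts foo ancestors by hand. The sample document is the question's fragment with qux assumed to sit inside the top-level foo, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed version of the question's fragment (qux inside the top foo).
DOC = """
<foo>
    <bar name="bar1"/>
    <bar name="bar2"/>
    <qux>
        <foo>
            <bar name="bar3"/>
        </foo>
        <bar name="bar4"/>
    </qux>
</foo>
"""

def bars_with_single_foo_ancestor(root):
    """Return names of bar elements that have fewer than two foo ancestors."""
    selected = []

    def walk(elem, foo_ancestors):
        # foo_ancestors counts the foo elements strictly above elem.
        if elem.tag == "bar" and foo_ancestors < 2:
            selected.append(elem.get("name"))
        # Children of a foo gain one more foo ancestor.
        below = foo_ancestors + (1 if elem.tag == "foo" else 0)
        for child in elem:
            walk(child, below)

    walk(root, 0)
    return selected

names = bars_with_single_foo_ancestor(ET.fromstring(DOC))
print(names)
```

Here bar3 is skipped because both the nested foo and the top foo sit above it, which is the same condition that makes ancestor::foo[2] non-empty.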

XSLT-based verification:

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
   "//bar[not(ancestor::foo[2])]"/>
 </xsl:template>
</xsl:stylesheet>

when applied to the following XML document (based on the provided XML fragment, but made well-formed and given slightly more nesting and complexity to make it interesting):

<foo>
    <bar name="bar1">
    </bar>
    <bar name="bar2">
    </bar>
    <qux>
        <foo>
           <baz>
             <bar name="bar3">
             </bar>
            </baz>
        </foo>
        <bar name="bar4">
        </bar>
        <qux>
            <foo>
                <bar name="bar5">
                </bar>
            </foo>
            <bar name="bar6">
            </bar>
        </qux>
    </qux>
</foo>

outputs exactly the wanted elements:

<bar name="bar1">

</bar>
<bar name="bar2">

</bar>
<bar name="bar4">

</bar>
<bar name="bar6">

</bar>

As @Cheeso said, the document is not well-formed, and it doesn't seem to jibe with your question.

If this is the document you meant (where qux is inside the first foo):

<foo>
    <bar name="bar1">
    </bar>
    <bar name="bar2">
    </bar>
    <qux>
        <foo>
            <bar name="bar3">
            </bar>
        </foo>
        <bar name="bar4">
        </bar>
    </qux>
</foo>

then here are two paths

//bar[not(parent::foo[ancestor::foo])]
//bar[1 >= count(ancestor::foo)]

that will select the elements you want (tested in .NET).
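The two paths agree on the document above, but they test different things: the first looks only at the bar's parent, the second counts every foo ancestor. A rough Python sketch of both predicates follows, run on a hypothetical variation in which bar3 is wrapped in a baz element. It is a hand-written traversal that mirrors the predicates, not an XPath evaluation, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical variation: bar3 sits inside baz, so its parent is not a foo.
DOC = """
<foo>
    <bar name="bar1"/>
    <qux>
        <foo>
            <baz>
                <bar name="bar3"/>
            </baz>
        </foo>
        <bar name="bar4"/>
    </qux>
</foo>
"""

def select_bars(root):
    by_parent, by_count = [], []

    def walk(elem, ancestors):
        # ancestors holds the tags from the root down to elem's parent.
        if elem.tag == "bar":
            # Mirrors //bar[not(parent::foo[ancestor::foo])]
            parent_is_nested_foo = (
                bool(ancestors)
                and ancestors[-1] == "foo"
                and "foo" in ancestors[:-1]
            )
            if not parent_is_nested_foo:
                by_parent.append(elem.get("name"))
            # Mirrors //bar[1 >= count(ancestor::foo)]
            if ancestors.count("foo") <= 1:
                by_count.append(elem.get("name"))
        for child in elem:
            walk(child, ancestors + [elem.tag])

    walk(root, [])
    return by_parent, by_count

by_parent, by_count = select_bars(ET.fromstring(DOC))
print(by_parent)
print(by_count)
```

On this variation the parent-based test keeps bar3, since its parent is baz, while the count-based test still drops it. If bar elements may be nested more deeply inside a nested foo, the count-based path is probably the safer choice.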
