XPath expression with condition on multiple ancestors

The application I am developing receives an XML structure similar to the following:

<Root>
    <Valid>
        <Child name="Child1" />
        <Container>
            <Child name="Child2" />
        </Container>
        <Container>
            <Container>
                <Child name="Child3"/>
                <Child name="Child4"/>
            </Container>
        </Container>
        <Wrapper>
            <Child name="Child5" />
        </Wrapper>
        <Wrapper>
            <Container>
                <Child name="Child19" />
            </Container>
        </Wrapper>
        <Container>
            <Wrapper>
                <Child name="Child6" />
            </Wrapper>
        </Container>
        <Container>
            <Wrapper>
                <Container>
                    <Child name="Child20" />
                </Container>
            </Wrapper>
        </Container>
    </Valid>
    <Invalid>
        <Child name="Child7" />
        <Container>
            <Child name="Child8" />
        </Container>
        <Container>
            <Container>
                <Child name="Child9"/>
                <Child name="Child10"/>
            </Container>
        </Container>
        <Wrapper>
            <Child name="Child11" />
        </Wrapper>
        <Container>
            <Wrapper>
                <Child name="Child12" />
            </Wrapper>
        </Container>
    </Invalid>
</Root>

I need to get a list of Child elements under the following conditions:

  1. Child is an n-th generation descendant of a Valid ancestor.
  2. Child may be an m-th generation descendant of a Container ancestor, which is itself an o-th generation descendant of a Valid ancestor.
  3. The only permitted ancestors of a Child element are Container elements (as m-th generation ancestors) and the Valid element as the first-generation ancestor at the top of that chain.

where m, n, o are natural numbers.

In other words, I need to write the following XPath expressions

Valid/Child
Valid/Container/Child
Valid/Container/Container/Child
Valid/Container/Container/Container/Child
...

as a single XPath expression.

For the provided example, the XPath expression should return only the Child elements whose name attribute equals Child1, Child2, Child3 or Child4.

The closest I have come to a solution is the following expression:

Valid/Child | Valid//*[self::Container]/Child

However, this also selects the Child elements whose name attribute equals Child19 and Child20.

Does XPath syntax support either an optional occurrence of an element, or applying a condition similar to self in the previous example to all ancestors between the Child and Valid elements?
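To make the problem with that attempt concrete, here is a minimal Python sketch that reproduces it. It assumes the sample document above (condensed onto fewer lines) and emulates the two branches of the expression with plain loops from the standard library's xml.etree.ElementTree, rather than running a full XPath engine; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question.
XML = """
<Root>
  <Valid>
    <Child name="Child1"/>
    <Container><Child name="Child2"/></Container>
    <Container><Container><Child name="Child3"/><Child name="Child4"/></Container></Container>
    <Wrapper><Child name="Child5"/></Wrapper>
    <Wrapper><Container><Child name="Child19"/></Container></Wrapper>
    <Container><Wrapper><Child name="Child6"/></Wrapper></Container>
    <Container><Wrapper><Container><Child name="Child20"/></Container></Wrapper></Container>
  </Valid>
  <Invalid>
    <Child name="Child7"/>
    <Container><Child name="Child8"/></Container>
    <Container><Container><Child name="Child9"/><Child name="Child10"/></Container></Container>
    <Wrapper><Child name="Child11"/></Wrapper>
    <Container><Wrapper><Child name="Child12"/></Wrapper></Container>
  </Invalid>
</Root>
"""

root = ET.fromstring(XML)
selected = []
for valid in root.findall("Valid"):
    # Branch 1: Valid/Child
    selected.extend(valid.findall("Child"))
    # Branch 2: Valid//*[self::Container]/Child
    # Any Container below Valid qualifies, whatever sits between the two.
    for container in valid.iter("Container"):
        selected.extend(container.findall("Child"))

names = [child.get("name") for child in selected]
print(names)
# ['Child1', 'Child2', 'Child3', 'Child4', 'Child19', 'Child20']
```

Child19 and Child20 slip through because the second branch only checks the immediate parent of Child, not every element between that parent and Valid.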

Use:

//Child[ancestor::*
          [not(self::Container)][1]
                            [self::Valid]
       ]

When this XPath expression is evaluated on the XML document provided in the question, exactly the wanted nodes are selected:

<Child name="Child1"/>
<Child name="Child2"/>
<Child name="Child3"/>
<Child name="Child4"/>

Explanation:

The expression:

//Child[ancestor::*
          [not(self::Container)][1]
                            [self::Valid]
       ]

means:

From all Child elements in the document, select only those for which the nearest ancestor that is not a Container is a Valid element. (ancestor:: is a reverse axis, so [1] picks the closest such ancestor.)
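The same rule can be sketched procedurally. The following is only an illustration in Python, assuming the sample document from the question (condensed) and the standard library's xml.etree.ElementTree, which does not support the ancestor axis; the parent map and the function name select_children are therefore helpers introduced here, not part of the XPath answer.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question.
XML = """
<Root>
  <Valid>
    <Child name="Child1"/>
    <Container><Child name="Child2"/></Container>
    <Container><Container><Child name="Child3"/><Child name="Child4"/></Container></Container>
    <Wrapper><Child name="Child5"/></Wrapper>
    <Wrapper><Container><Child name="Child19"/></Container></Wrapper>
    <Container><Wrapper><Child name="Child6"/></Wrapper></Container>
    <Container><Wrapper><Container><Child name="Child20"/></Container></Wrapper></Container>
  </Valid>
  <Invalid>
    <Child name="Child7"/>
    <Container><Child name="Child8"/></Container>
    <Container><Container><Child name="Child9"/><Child name="Child10"/></Container></Container>
    <Wrapper><Child name="Child11"/></Wrapper>
    <Container><Wrapper><Child name="Child12"/></Wrapper></Container>
  </Invalid>
</Root>
"""


def select_children(root):
    # ElementTree elements carry no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    selected = []
    for child in root.iter("Child"):
        # Skip over every Container on the way up; the element we stop at
        # corresponds to ancestor::*[not(self::Container)][1].
        ancestor = parent_of.get(child)
        while ancestor is not None and ancestor.tag == "Container":
            ancestor = parent_of.get(ancestor)
        # Keep the Child only if that element is Valid: [self::Valid].
        if ancestor is not None and ancestor.tag == "Valid":
            selected.append(child.get("name"))
    return selected


names = select_children(ET.fromstring(XML))
print(names)  # ['Child1', 'Child2', 'Child3', 'Child4']
```

With a full XPath 1.0 engine, the expression itself can be evaluated directly and no such emulation is needed.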

//Valid
 //Child[count(ancestor::Container[ancestor::Valid])
          = count(ancestor::*[ancestor::Valid])]

Explanation:

//Valid//Child

Returns all Child nodes that are descendants of Valid nodes.

count(ancestor::Container[ancestor::Valid])

Returns the number of Container elements that are ancestors of the current node ( Child ) and themselves have an ancestor called Valid.

count(ancestor::*[ancestor::Valid])

Returns the number of all elements that are ancestors of the current node ( Child ) and themselves have an ancestor called Valid.

Therefore, the two values are equal only if every element between Valid and Child is a Container.

However, this expression assumes that there won't be any nested Valid elements, i.e. /Valid/Valid/Child will not be accepted by it, because the inner Valid is counted among the ancestors below Valid but is not a Container.
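The counting logic and the nested Valid caveat can be checked with a short Python sketch. It assumes the sample document from the question (condensed) and emulates both count() calls with the standard library's xml.etree.ElementTree; the helpers ancestors and select_by_count, and the small nested document, are hypothetical names and data introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question.
XML = """
<Root>
  <Valid>
    <Child name="Child1"/>
    <Container><Child name="Child2"/></Container>
    <Container><Container><Child name="Child3"/><Child name="Child4"/></Container></Container>
    <Wrapper><Child name="Child5"/></Wrapper>
    <Wrapper><Container><Child name="Child19"/></Container></Wrapper>
    <Container><Wrapper><Child name="Child6"/></Wrapper></Container>
    <Container><Wrapper><Container><Child name="Child20"/></Container></Wrapper></Container>
  </Valid>
  <Invalid>
    <Child name="Child7"/>
    <Container><Child name="Child8"/></Container>
    <Container><Container><Child name="Child9"/><Child name="Child10"/></Container></Container>
    <Wrapper><Child name="Child11"/></Wrapper>
    <Container><Wrapper><Child name="Child12"/></Wrapper></Container>
  </Invalid>
</Root>
"""


def ancestors(elem, parent_of):
    """Return the ancestors of elem, nearest first."""
    chain = []
    node = parent_of.get(elem)
    while node is not None:
        chain.append(node)
        node = parent_of.get(node)
    return chain


def select_by_count(root):
    parent_of = {child: parent for parent in root.iter() for child in parent}
    selected = []
    for child in root.iter("Child"):
        chain = ancestors(child, parent_of)
        # //Valid//Child: the Child must have a Valid ancestor at all.
        if not any(a.tag == "Valid" for a in chain):
            continue
        # ancestor::*[ancestor::Valid]
        below_valid = [a for a in chain
                       if any(b.tag == "Valid" for b in ancestors(a, parent_of))]
        # ancestor::Container[ancestor::Valid]
        containers = [a for a in below_valid if a.tag == "Container"]
        if len(containers) == len(below_valid):
            selected.append(child.get("name"))
    return selected


names = select_by_count(ET.fromstring(XML))
print(names)  # ['Child1', 'Child2', 'Child3', 'Child4']

# Hypothetical nested case: the inner Valid is counted as a non-Container ancestor.
nested = ET.fromstring('<Root><Valid><Valid><Child name="X"/></Valid></Valid></Root>')
nested_names = select_by_count(nested)
print(nested_names)  # []
```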

Update: Looking at your XML one more time, wouldn't this be easier, given that Wrapper is the only other element that appears between Valid and Child?

//Valid//Child[not(ancestor::Wrapper)]
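As a rough check of this shortcut, the sketch below emulates it in Python with the standard library's xml.etree.ElementTree, assuming the sample document from the question (condensed). The function name select_without_wrapper and the extra document containing an Other element are hypothetical and serve only to show where the shortcut differs from the stricter expressions.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question.
XML = """
<Root>
  <Valid>
    <Child name="Child1"/>
    <Container><Child name="Child2"/></Container>
    <Container><Container><Child name="Child3"/><Child name="Child4"/></Container></Container>
    <Wrapper><Child name="Child5"/></Wrapper>
    <Wrapper><Container><Child name="Child19"/></Container></Wrapper>
    <Container><Wrapper><Child name="Child6"/></Wrapper></Container>
    <Container><Wrapper><Container><Child name="Child20"/></Container></Wrapper></Container>
  </Valid>
  <Invalid>
    <Child name="Child7"/>
    <Container><Child name="Child8"/></Container>
    <Container><Container><Child name="Child9"/><Child name="Child10"/></Container></Container>
    <Wrapper><Child name="Child11"/></Wrapper>
    <Container><Wrapper><Child name="Child12"/></Wrapper></Container>
  </Invalid>
</Root>
"""


def select_without_wrapper(root):
    parent_of = {child: parent for parent in root.iter() for child in parent}
    selected = []
    for child in root.iter("Child"):
        # Collect the tag names of all ancestors of this Child.
        tags = []
        node = parent_of.get(child)
        while node is not None:
            tags.append(node.tag)
            node = parent_of.get(node)
        # //Valid//Child[not(ancestor::Wrapper)]
        if "Valid" in tags and "Wrapper" not in tags:
            selected.append(child.get("name"))
    return selected


names = select_without_wrapper(ET.fromstring(XML))
print(names)  # ['Child1', 'Child2', 'Child3', 'Child4']

# Hypothetical document with an element that is neither Container nor Wrapper.
extra = ET.fromstring('<Root><Valid><Other><Child name="ChildX"/></Other></Valid></Root>')
extra_names = select_without_wrapper(extra)
print(extra_names)  # ['ChildX']
```

On the sample data the result matches the other expressions, but any intermediate element other than Wrapper would be let through, so this form is only suitable if the set of possible element names is known in advance.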
