Conditions on recursive XPath

How can I use recursive AND conditional selection in XPath?

For example, given this document:

<root xmlns:foo="http://www.foo.org/" xmlns:bar="http://www.bar.org">
  <file name="foo.mp4">
    <chunks>
      <file>
        <chunks>
          <file>
          <chunks>
            <file>1</file>
            <file>2</file>
            <file>3</file>
            <file>4</file>
          </chunks>
          </file>
          <file>
          <chunks>
            <file>5</file>
            <file>6</file>
            <file>7</file>
            <file>8</file>
          </chunks>
          </file>
        </chunks>
      </file>
      <file>
        <chunks>
          <file>
          <chunks>
            <file>9</file>
            <file>10</file>
            <file>11</file>
            <file>12</file>
          </chunks>
          </file>
          <file>
          <chunks>
            <file>13</file>
            <file>14</file>
            <file>15</file>
            <file>16</file>
          </chunks>
          </file>
        </chunks>
      </file>
    </chunks>
  </file>
</root>

I would like to select just:

<file>1</file>
<file>2</file>
<file>3</file>
<file>4</file>

So, effectively this:

//*[@name="foo.mp4"]/chunks/*[1]/chunks/*[1]/chunks/*

But with a generalized approach, i.e. something that would also cover more deeply nested objects. Something like this:

//*[@name="foo.mp4"]/(chunks/*[1]/)+chunks/*

(cond)+ is not XPath syntax, just a regex-like representation of what I want.

Recursion implies self-reference and is not directly available in XPath. The usual way to skip intervening levels of elements is the descendant-or-self axis (abbreviated //), anchored by a predicate on the property you actually want.

For example, each of the following XPath expressions,

  • All file elements with values less than 5:

     //file[number() < 5] 
  • The first 4 leaf file elements:

     //file[not(*)][count(preceding::file[not(*)]) < 4] 
  • The leaf file elements none of whose ancestors has a preceding element:

     //file[not(*)][not(ancestor::*[preceding::*])] 

will select

<file>1</file>
<file>2</file>
<file>3</file>
<file>4</file>

as requested.
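
To make the three predicates above more concrete, here is a rough Python sketch that restates each of them in plain code. It is only an illustration under some assumptions: it uses the standard library's xml.etree.ElementTree, which implements only a small XPath subset (no not(), count(), number() or preceding axis), so the conditions are re-implemented by hand rather than evaluated as XPath. The sample document is a trimmed copy of the one in the question with the namespaces left out, and the helper name on_leftmost_path is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (assumption: same nesting, fewer leaves).
XML = """
<root>
  <file name="foo.mp4">
    <chunks>
      <file>
        <chunks>
          <file><chunks><file>1</file><file>2</file><file>3</file><file>4</file></chunks></file>
          <file><chunks><file>5</file><file>6</file><file>7</file><file>8</file></chunks></file>
        </chunks>
      </file>
      <file>
        <chunks>
          <file><chunks><file>9</file><file>10</file></chunks></file>
        </chunks>
      </file>
    </chunks>
  </file>
</root>
"""

root = ET.fromstring(XML)

# Analogue of //file[not(*)]: <file> elements without child elements,
# in document order.
leaves = [f for f in root.iter("file") if len(f) == 0]

# Analogue of //file[number() < 5]: in this document only the leaves have
# a purely numeric string value, so it is enough to test those.
below_five = [f.text for f in leaves if float(f.text) < 5]

# Analogue of [count(preceding::file[not(*)]) < 4]: the first four leaves.
first_four = [f.text for f in leaves[:4]]

# Analogue of [not(ancestor::*[preceding::*])]: every ancestor must be the
# first child of its own parent, all the way up to the root.
parent = {child: p for p in root.iter() for child in p}

def on_leftmost_path(elem):
    node = parent.get(elem)
    while node is not None and node in parent:
        if parent[node][0] is not node:
            return False
        node = parent[node]
    return True

leftmost = [f.text for f in leaves if on_leftmost_path(f)]

print(below_five, first_four, leftmost)
```

On this trimmed sample all three lists contain the same four values, matching the selection shown above. A full XPath 1.0 engine would let you evaluate the original expressions directly instead.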

There is no such thing as recursive XPath as far as I know. So you'll need to combine XPath with something else, such as XSLT or a general-purpose programming language, to do recursion. Using pure XPath, you'll need to formulate the requirement differently, if possible.
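
As a rough idea of what "XPath plus a programming language" could look like, here is a minimal Python sketch. It rests on several assumptions: it uses the standard library's xml.etree.ElementTree, the sample is a trimmed copy of the question's document without the namespaces, and the function name first_leaf_group is invented for this example. XPath is used only to find the starting element; the repeated chunks/*[1] step is done by ordinary recursion.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (assumption: same nesting, fewer leaves).
XML = """
<root>
  <file name="foo.mp4">
    <chunks>
      <file>
        <chunks>
          <file><chunks><file>1</file><file>2</file><file>3</file><file>4</file></chunks></file>
          <file><chunks><file>5</file><file>6</file><file>7</file><file>8</file></chunks></file>
        </chunks>
      </file>
      <file>
        <chunks>
          <file><chunks><file>9</file><file>10</file></chunks></file>
        </chunks>
      </file>
    </chunks>
  </file>
</root>
"""

def first_leaf_group(node):
    """Follow chunks/*[1] as deep as it goes, then return the children
    of the last <chunks> element reached."""
    chunks = node.find("chunks")
    if chunks is None or len(chunks) == 0:
        return []
    first = chunks[0]
    if first.find("chunks") is None:
        # The first child is a leaf, so this <chunks> holds the leaf group.
        return list(chunks)
    return first_leaf_group(first)

root = ET.fromstring(XML)
# ElementTree's XPath subset is enough to locate the starting element.
start = root.find(".//file[@name='foo.mp4']")
result = [f.text for f in first_leaf_group(start)]
print(result)  # ['1', '2', '3', '4']
```

Because the depth is discovered at run time, the same function would work unchanged if the document had more or fewer nesting levels.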

I don't know if this is applicable to your actual data, but suppose you can reformulate the requirement as something like the following:

"within file[@name='foo.mp4'], find the first <chunks> that contains leaf <file> elements, i.e. <file> elements that contain no child elements, only text nodes, and return those leaf <file> elements"

Then a pure XPath solution is possible:

(//file[@name='foo.mp4']//chunks[not(file/*)])[1]/file

In a test against the sample XML from the question, the above XPath expression returns the expected output, file 1 to 4.
