
XPath: How to combine multiple conditions on different axes

I'm trying to extract all links that meet these three conditions:

  • The link must be inside a <div data-test="cond1">
  • The link must be an <a href="..." class="cond2">
  • That <div> must not contain an <img src="..." class="cond3">

The result should be "/product/1234".

<div data-test="test1">
  <div>
    <div data-test="cond1">
      <a href="/product/1234" class="cond2">Link 1</a>
      <div class="test4">
        <div class="test5">
          <div class="test6">
            <div class="test7">
              <div class="test8">
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>
<div data-test="test2">
  <div>
    <div data-test="cond1">
      <a href="/product/5678" class="cond2">Link 2</a>
      <div class="test4">
        <div class="test5">
          <div class="test6">
            <div class="test7">
              <div class="test8">
                <img src="bild.jpg" class="cond3">
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>

I'm able to extract both links with the following XPath query:

//div[starts-with(@data-test,"cond")]/a[starts-with(@class,"cond")]/@href

(I know the first part is not really necessary, but better safe than sorry.)

But I'm still struggling to exclude the links whose enclosing div has a descendant img tag, and I don't know how to add that to the query above.

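This should do what you want. The not(.//img[...]) predicate sits on the div, so any cond1 container with a cond3 image anywhere beneath it is rejected before its link is selected: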

//div[@data-test="cond1" and not(.//img[@class="cond3"])]
/a[@class="cond2"]
/@href

Result: /product/1234
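
To make the logic of that query easier to follow, here is a minimal sketch that applies the same three conditions one at a time in Python. It uses the standard-library xml.etree.ElementTree, which only supports a subset of XPath and has no not(), so the exclusion is written as an explicit check. The markup is a trimmed copy of the question's sample; closing the <img> tag and wrapping everything in a <root> element are assumptions made so the XML parser accepts it, and the name matching_hrefs is purely illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the sample markup. Assumptions: the <img> is
# self-closed and a <root> wrapper is added so it parses as XML.
DOC = """
<root>
  <div data-test="test1">
    <div>
      <div data-test="cond1">
        <a href="/product/1234" class="cond2">Link 1</a>
        <div class="test4"><div class="test8"></div></div>
      </div>
    </div>
  </div>
  <div data-test="test2">
    <div>
      <div data-test="cond1">
        <a href="/product/5678" class="cond2">Link 2</a>
        <div class="test4"><div class="test8"><img src="bild.jpg" class="cond3"/></div></div>
      </div>
    </div>
  </div>
</root>
"""


def matching_hrefs(xml_text):
    """Return hrefs of cond2 links whose cond1 container has no cond3 image."""
    root = ET.fromstring(xml_text)
    hrefs = []
    # Condition 1: //div[@data-test="cond1"]
    for div in root.iterfind(".//div[@data-test='cond1']"):
        # Condition 3: not(.//img[@class="cond3"]), checked by hand
        if div.find(".//img[@class='cond3']") is not None:
            continue
        # Condition 2: /a[@class="cond2"]/@href (direct children only)
        for a in div.findall("a[@class='cond2']"):
            hrefs.append(a.get("href"))
    return hrefs


print(matching_hrefs(DOC))  # ['/product/1234']
```

With a full XPath 1.0 engine the single expression above does all of this in one step; the loop is only meant to show which element each condition is tested against.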
