
XPath to check for specific text within a node

I have this as a node to parse:

<h3 class="atag">
    <a href="http://www.example.com">
      <span class="btag">text to be ignored</span>
         </a>
           <span class="ctag">text to be checked</span>
</h3>

I need to extract "http://www.example.com" but not the "text to be ignored" part; I also need to check whether the ctag span contains "text to be checked".

I came up with this, but it doesn't seem to do the job:

response.xpath("//h3/a/@*[not(self::span)]").extract()

Any ideas?

If you just need to select the href from the 'a' tag, use @href. To also check whether the ctag span contains some text, I think you can use an expression like this:

'//h3[contains(span[@class="ctag"]/text(), "text to be checked")]/a/@href'

This checks whether the given h3 block has a child span with class "ctag" whose text contains "text to be checked". If it does, 'http://www.example.com' is returned; otherwise the result is empty.
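
The behaviour described above can be tried without a Scrapy project. Below is a minimal sketch using Python's standard-library xml.etree.ElementTree. That module supports only a subset of XPath (no contains() function), so the substring test is done in Python here; with Scrapy's response.xpath, the single expression above should do the same in one call. The helper name href_if_ctag_contains is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The node from the question, used as a stand-alone, well-formed fragment.
SNIPPET = """
<h3 class="atag">
    <a href="http://www.example.com">
      <span class="btag">text to be ignored</span>
    </a>
    <span class="ctag">text to be checked</span>
</h3>
"""


def href_if_ctag_contains(h3, needle):
    """Return the href values of the <a> children of h3, but only when the
    first direct child <span class="ctag"> has text containing needle."""
    # Mirrors the predicate contains(span[@class="ctag"]/text(), needle):
    # XPath 1.0 uses the first matching node when converting to a string.
    span = h3.find("span[@class='ctag']")
    if span is None or needle not in (span.text or ""):
        return []
    # Mirrors the trailing /a/@href step.
    return [a.get("href") for a in h3.findall("a") if a.get("href") is not None]


root = ET.fromstring(SNIPPET)  # root is the <h3> element itself
print(href_if_ctag_contains(root, "text to be checked"))  # ['http://www.example.com']
print(href_if_ctag_contains(root, "some other text"))     # []
```

The text inside the btag span never appears in the result, because only the href attribute of the 'a' tag is read.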

Do you mean something like this XPath?

//h3/a[following-sibling::span[@class='ctag' and .='text to be checked']]/@href

The XPath above selects an <a> element that is followed by a sibling <span class="ctag"> whose text is exactly "text to be checked", then returns the href attribute of that <a> element.
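
To make the two conditions in that expression concrete (the span must come after the <a> among the h3 children, and its text must match exactly rather than as a substring), here is a rough standard-library sketch. xml.etree.ElementTree has no following-sibling axis, so the sibling walk is written out by hand; the function name hrefs_followed_by_ctag is hypothetical.

```python
import xml.etree.ElementTree as ET


def hrefs_followed_by_ctag(h3, expected):
    """Emulate a[following-sibling::span[@class='ctag' and .=expected]]/@href
    for the direct children of a single <h3> element."""
    children = list(h3)
    hrefs = []
    for i, child in enumerate(children):
        if child.tag != "a" or child.get("href") is None:
            continue
        # following-sibling:: only looks at siblings that come after the <a>.
        for sib in children[i + 1:]:
            # .='...' compares the element's full string value, exactly.
            if (sib.tag == "span" and sib.get("class") == "ctag"
                    and "".join(sib.itertext()) == expected):
                hrefs.append(child.get("href"))
                break
    return hrefs


h3 = ET.fromstring(
    '<h3 class="atag">'
    '<a href="http://www.example.com"><span class="btag">text to be ignored</span></a>'
    '<span class="ctag">text to be checked</span>'
    '</h3>'
)
print(hrefs_followed_by_ctag(h3, "text to be checked"))  # ['http://www.example.com']
print(hrefs_followed_by_ctag(h3, "text to be"))          # [] (exact match, not substring)
```

If the real page has extra whitespace around the span text, an exact comparison like this will fail, and a contains()-style check may be the safer choice.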
