Select specific HTML tags from a list of HTML tags via an XPath selector

I want to get some specific information from this HTML code:

<div class="main">
    <div class="a"><div><a>linkname1</a></div></div> <!-- I do NOT want the text of this 'a' tag -->
    <div class="b">xxx</div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname2</a></div></div> <!-- I want the text of this 'a' tag -->
    <div class="a"><div><a>linkname3</a></div></div> <!-- I want the text of this 'a' tag -->
    <div class="a"><div><a>linkname4</a></div></div> <!-- I want the text of this 'a' tag -->
    <div class="a"><div><a>linkname5</a></div></div> <!-- I want the text of this 'a' tag -->
    <div class="d"></div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname6</a></div></div> <!-- I do NOT want the text of this 'a' tag -->
    <div class="a"><div><a>linkname7</a></div></div> <!-- I do NOT want the text of this 'a' tag -->
    <div class="a"><div><a>linkname8</a></div></div> <!-- I do NOT want the text of this 'a' tag -->
    <div class="d"></div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname9</a></div></div> <!-- I do NOT want the text of this 'a' tag -->
    <div class="a"><div><a>linkname10</a></div></div> <!-- I do NOT want the text of this 'a' tag -->
</div>

I want to get, as an array, the link texts from the second block of 'a'-class divs (the ones between the first div with class 'c' and the second div with class 'c'). How can I do that with an XPath selector? Is it possible? I can't figure out how to do it.

With my example, the expected result is:

linkname2
linkname3
linkname4
linkname5

Thank you :)

Your question is a set-operation problem, as explained in this SO answer: How to perform set operations in XPath 1.0.

Applied to your specific situation, you should use an intersection like this:

(: intersection :)
$set1[count(. | $set2) = count($set2)]

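set1 should be the set of siblings following the first div[@class='c'], and
set2 should be the set of siblings preceding the first div[@class='d'].

Now, putting both together according to the above formula with

set1 = "div[@class='c'][1]/following-sibling::*" and
set2 = "div[@class='d'][1]/preceding-sibling::*"

the XPath expression could look like this. Note that current() is an XSLT function, not part of plain XPath 1.0, so this form assumes the expression is evaluated in XSLT with div[@class='main'] as the current node: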

div[@class='c'][1]/following-sibling::*[count(. | current()/div[@class='d'][1]/preceding-sibling::*) = count(current()/div[@class='d'][1]/preceding-sibling::*)]

Output:

linkname2
linkname3
linkname4
linkname5
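
To make the intersection idea concrete, here is a minimal Python sketch. It does not evaluate the XPath itself: the standard library's xml.etree.ElementTree has no following-sibling or preceding-sibling axes and no count(), so the two sets are built by hand and then intersected. The function name links_between_first_c_and_first_d is made up for this illustration.

```python
import xml.etree.ElementTree as ET

HTML = """
<div class="main">
    <div class="a"><div><a>linkname1</a></div></div>
    <div class="b">xxx</div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname2</a></div></div>
    <div class="a"><div><a>linkname3</a></div></div>
    <div class="a"><div><a>linkname4</a></div></div>
    <div class="a"><div><a>linkname5</a></div></div>
    <div class="d"></div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname6</a></div></div>
    <div class="a"><div><a>linkname7</a></div></div>
    <div class="a"><div><a>linkname8</a></div></div>
    <div class="d"></div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname9</a></div></div>
    <div class="a"><div><a>linkname10</a></div></div>
</div>
"""


def links_between_first_c_and_first_d(markup):
    """Hypothetical helper: mimic set1 ∩ set2 from the answer in plain Python."""
    main = ET.fromstring(markup.strip())
    children = list(main)

    # Index of the first child with class 'c' and the first with class 'd'.
    first_c = next(i for i, el in enumerate(children) if el.get("class") == "c")
    first_d = next(i for i, el in enumerate(children) if el.get("class") == "d")

    # set1: div[@class='c'][1]/following-sibling::*
    set1 = children[first_c + 1:]
    # set2: div[@class='d'][1]/preceding-sibling::*
    set2 = children[:first_d]

    # Intersection by node identity, keeping document order from set1.
    set2_ids = {id(el) for el in set2}
    both = [el for el in set1 if id(el) in set2_ids]

    # Collect the text of every <a> below the selected elements.
    return [a.text for el in both for a in el.iter("a")]


result = links_between_first_c_and_first_d(HTML)
print(result)
```

With an XPath 1.0 engine that lacks current(), the same intersection can be written with absolute paths in place of current(), for example /div/div[@class='c'][1]/following-sibling::* for set1.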

You can try the following expression:

/div/div[position() > 3 and position() < 8]/div/a/text()

I found a possible solution :)

//following::div[@class='a' and count(preceding::div[@class="c"]) = 1]/div/a/text()
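
The two shorter expressions above select the same nodes on the sample, but they rely on different things: one on fixed child positions, the other on how many class-'c' divs come earlier. The sketch below imitates both in plain Python (standard-library ElementTree cannot evaluate position() ranges or the preceding axis) and shows what happens when one extra div is added at the top. The function names are hypothetical, and the preceding-count version assumes that class-'c' divs only occur as direct children of the main div.

```python
import xml.etree.ElementTree as ET

HTML = """
<div class="main">
    <div class="a"><div><a>linkname1</a></div></div>
    <div class="b">xxx</div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname2</a></div></div>
    <div class="a"><div><a>linkname3</a></div></div>
    <div class="a"><div><a>linkname4</a></div></div>
    <div class="a"><div><a>linkname5</a></div></div>
    <div class="d"></div>
    <div class="c">xxx</div>
    <div class="a"><div><a>linkname6</a></div></div>
    <div class="d"></div>
</div>
"""


def by_position(main):
    """Imitates /div/div[position() > 3 and position() < 8]/div/a/text()."""
    # 1-based positions 4..7 are 0-based indices 3..6.
    return [a.text for el in list(main)[3:7] for a in el.findall("./div/a")]


def by_preceding_c(main):
    """Imitates div[@class='a' and count(preceding::div[@class='c']) = 1]."""
    seen_c = 0
    texts = []
    for el in main:
        if el.get("class") == "c":
            seen_c += 1
        elif el.get("class") == "a" and seen_c == 1:
            texts.extend(a.text for a in el.findall("./div/a"))
    return texts


main = ET.fromstring(HTML.strip())
original_position = by_position(main)
original_preceding = by_preceding_c(main)

# Add one unrelated div at the top and run both selections again.
main.insert(0, ET.Element("div", {"class": "b"}))
shifted_position = by_position(main)
shifted_preceding = by_preceding_c(main)

print(original_position, original_preceding)
print(shifted_position, shifted_preceding)
```

On the unchanged sample both functions return linkname2 through linkname5. After the extra div is inserted, the positional version loses linkname5 while the preceding-count version is unaffected, so the positional expression is only safe when the page layout is fixed.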
