
Related to predicates in HtmlAgilityPack

I want to fetch data from a website using HtmlAgilityPack. The page content looks like this:

<div id="list">
 <div class="list1">
   <a href="example1.com" class="href1" >A1</a>
   <a href="example4.com" class="href2" />
 </div>
 <div class="list2">
   <a href="example2.com" class="href1" >A2</a>
   <a href="example5.com" class="href2" />
 </div>
 <div class="list3">
   <a href="example3.com" class="href1" >A3</a>
   <a href="example6.com" class="href2" />
 </div>
</div>

Now I want to fetch the first two links that have class="href1". I am using this code:

HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@class='href1'][position()<3]");

But it is not working: it returns all three links. I want only the first two. How can I do this?

I also want to do one more thing.

Above, I have only three links with class="href1". Suppose I have 10 links with class="href1" and I want only four of them, the 6th through the 9th. How can I fetch these particular four links?

Before applying the position() function, try wrapping the anchor selector in parentheses, like this:

var nodes = doc.DocumentNode.SelectNodes("(//a[@class='href1'])[position()<3]");
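The reason the parentheses matter: in //a[@class='href1'][position()<3], position() is evaluated among the siblings under each parent, and every matching <a> is the first such link inside its own <div>, so all three pass the test. Wrapping the path in parentheses makes position() count over the whole result set in document order. The same idea should cover the follow-up question about the 6th through 9th links. Here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the LinkRange class and SelectHrefs method are hypothetical names made up for illustration, not part of the library.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class LinkRange
{
    // Hypothetical helper: returns the href values of the first..last
    // (1-based, inclusive) anchors with class="href1", counted in
    // document order across the whole page.
    public static List<string> SelectHrefs(string html, int first, int last)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The parentheses make position() count over the whole result set
        // instead of per parent element.
        string xpath =
            $"(//a[@class='href1'])[position() >= {first} and position() <= {last}]";

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
            return new List<string>();

        return nodes.Select(n => n.GetAttributeValue("href", "")).ToList();
    }
}
```

With this, LinkRange.SelectHrefs(html, 1, 2) corresponds to the first two links, and LinkRange.SelectHrefs(html, 6, 9) to the 6th through 9th.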

Why not just get them all and use the first two from the returned collection? Whatever XPath you would need for this would ultimately be a hell of a lot less readable than using LINQ:

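using System.Collections.Generic;
using System.Linq;
...
// Take(2) returns IEnumerable<HtmlNode>, not HtmlNodeCollection.
IEnumerable<HtmlNode> firstTwoHrefs = doc.DocumentNode
    .SelectNodes("//a[@class='href1']").Take(2);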
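The LINQ approach extends to the follow-up question as well: skipping five nodes and taking four should give the 6th through 9th links. A rough sketch, assuming doc is an already loaded HtmlDocument; the LinkSlice class and Slice method are hypothetical names used only for illustration. It also guards against SelectNodes returning null, which it does when nothing matches.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class LinkSlice
{
    // Hypothetical helper: skips `skip` matching links, then returns
    // up to `take` of the ones that follow, in document order.
    public static List<HtmlNode> Slice(HtmlDocument doc, int skip, int take)
    {
        HtmlNodeCollection all = doc.DocumentNode.SelectNodes("//a[@class='href1']");

        // No matches: SelectNodes gives null, so return an empty list instead.
        if (all == null)
            return new List<HtmlNode>();

        return all.Skip(skip).Take(take).ToList();
    }
}
```

For example, LinkSlice.Slice(doc, 5, 4) corresponds to links 6 through 9, and LinkSlice.Slice(doc, 0, 2) to the first two.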
