Null reference exception when trying to get a link by class in HtmlAgilityPack

Question

I have an ASP.NET MVC application and an HTML page that I parse using HtmlAgilityPack, but when I loop over my elements I get the following error in my foreach: "Object reference not set to an instance of an object". My code is below. Does anybody know where my mistake is? I'm new to HtmlAgilityPack.

Part of HTML:

<li class="b-serp-item i-bem" onclick="return {&quot;b-serp-item&quot;:{}}">
  <i class="b-serp-item__favicon" style="background-position: 0 -0px"></i>
  <h2 class="b-serp-item__title">
    <b class="b-serp-item__number">1</b>
    <a class="b-serp-item__title-link" href="http://googlescraping.com/google-scraper.php">Google</a>
  </h2>
</li>

Code:

DateTime dt = DateTime.Now;
string dtf = String.Format("{0:u}", dt);
string wp = "page" + dtf + ".html";
HtmlDocument HD = new HtmlDocument();
HD.Load(wp);
string output = "";
foreach (HtmlNode node in HD.DocumentNode.SelectNodes("//a[@class='b-serp-item__title-link']"))
{
    output += node.GetAttributeValue("href", null) + " ";
}

I have shared the HTML output on Google Drive: https://drive.google.com/file/d/0B3-m-r5Ce0gOSTlzUGlTT1VBb00/edit?usp=sharing

Answer 1

I ran your code with one slight change: I used HtmlDocument.LoadHtml(stringContents) instead of HtmlDocument.Load(path), and it then worked flawlessly.

I suspect that the code is unable to find the file at that path. Check that the file exists using File.Exists(wp), and consider using a fully qualified path instead of just the file name, via wp = Path.GetFullPath(wp).
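
The path check can be sketched roughly as follows. PageFile, BuildFileName and ResolveExisting are hypothetical names used only for illustration. Note also that the "u" format in the question produces text such as "2014-02-06 21:11:03Z"; colons are not valid in Windows file names, so that may be one reason the file cannot be found. The sketch assumes a colon-free timestamp pattern is acceptable for your file names.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class PageFile
{
    // Builds a file name from a timestamp using only characters that are
    // valid in Windows file names (assumption: this naming scheme suits you).
    public static string BuildFileName(DateTime timestamp)
    {
        string stamp = timestamp.ToString("yyyy-MM-dd_HH-mm-ss", CultureInfo.InvariantCulture);
        return "page" + stamp + ".html";
    }

    // Resolves the name to an absolute path and fails with a clear message
    // if no file exists there.
    public static string ResolveExisting(string fileName)
    {
        string fullPath = Path.GetFullPath(fileName);
        if (!File.Exists(fullPath))
        {
            throw new FileNotFoundException("HTML file not found: " + fullPath, fullPath);
        }
        return fullPath;
    }
}
```

Calling PageFile.ResolveExisting(wp) before HD.Load(...) turns a silent path problem into an explicit FileNotFoundException that reports the absolute path that was actually tried.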

Alternatively, read the contents first using string contents = File.ReadAllText(wp); and then pass them to the LoadHtml method on the HtmlDocument.
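
A minimal sketch of this approach is shown here. LinkExtractor and its method names are hypothetical. The null check is an addition that the answer above does not mention: SelectNodes returns null rather than an empty collection when nothing matches, and iterating over that null in a foreach raises exactly the "Object reference not set to an instance of an object" error.

```csharp
using System.Collections.Generic;
using System.IO;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Parses an HTML string and returns the href of every matching link.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when no node matches.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//a[@class='b-serp-item__title-link']");

        var links = new List<string>();
        if (nodes == null)
        {
            return links; // nothing matched: return an empty list instead of failing
        }

        foreach (HtmlNode node in nodes)
        {
            string href = node.GetAttributeValue("href", string.Empty);
            if (href.Length > 0)
            {
                links.Add(href);
            }
        }
        return links;
    }

    // Reads the file contents first, then parses them via LoadHtml.
    public static List<string> ExtractLinksFromFile(string path)
    {
        return ExtractLinks(File.ReadAllText(path));
    }
}
```

With this in place, string.Join(" ", LinkExtractor.ExtractLinksFromFile(wp)) builds the space-separated output from the question, and an empty result signals that the XPath matched nothing in the loaded document.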

solution1
0 ACCEPTED 2014-02-06 21:11:03
