
C# HTML Agility Pack SelectSingleNode returning null

I have a web scraper developed using C#, Windows Forms, and the HTML Agility Pack.

It was all working well until the site changed its code and broke it. I know this happens often with web scrapers, but now I am having trouble figuring out how to correct the issue.

At this time my scraper loops through multiple URLs and scrapes data from each page.

The problem I am running into is that the site it loops through will randomly serve a newer template, which does not have the same HTML classes and IDs that I have defined in the program. What I am trying to do is run a simple if statement that checks whether a single node is null and, if it is, runs a separate set of code for the new template.

The problem I am having is that my program throws a NullReferenceException on my if statement.

Here is the statement I am using to check if it is null:

var varitem = doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']").InnerText;

 if (varitem == null) MessageBox.Show("no titles");

It throws the exception at the first line, where varitem is defined, and never even reaches the if statement.

Any advice appreciated!

Try the following:

var varitem = doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']");

SelectSingleNode can return null, so keep the node itself and check it before reading InnerText. It is also worth checking that InnerText is not null or empty:

if (varitem == null || string.IsNullOrEmpty(varitem.InnerText))
    MessageBox.Show("no titles");
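
Building on that check, here is a minimal sketch of the two-template fallback the question describes. It assumes the newer template exposes the title under a different selector; the XPath `//h1[@id='new-product-title']` and the `TitleScraper` / `GetTitle` names are hypothetical placeholders, not taken from the actual site or the original program.

```csharp
using System;
using HtmlAgilityPack;

static class TitleScraper
{
    // Returns the product title, or null when neither template matches.
    public static string GetTitle(HtmlDocument doc)
    {
        // Old template: keep the node itself so it can be null-checked
        // before InnerText is touched.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']");

        if (node == null)
        {
            // Hypothetical selector for the newer template; replace it with the real one.
            node = doc.DocumentNode.SelectSingleNode("//h1[@id='new-product-title']");
        }

        if (node == null || string.IsNullOrWhiteSpace(node.InnerText))
            return null;

        return node.InnerText.Trim();
    }
}

static class Program
{
    static void Main()
    {
        var doc = new HtmlDocument();
        // Sample markup that only matches the hypothetical new-template selector.
        doc.LoadHtml("<html><body><h1 id='new-product-title'> Widget </h1></body></html>");

        Console.WriteLine(TitleScraper.GetTitle(doc) ?? "no titles");
    }
}
```

Returning null from a single helper keeps the "which template is this?" decision in one place, so the scraping loop only has to handle the "no title found" case once.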

First you should check whether

 doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']")

returns null.

If it is null, you'll get the NullReferenceException from evaluating null.InnerText.
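
If the project targets C# 6 or later (an assumption, since the question does not state the language version), the null-conditional operator offers a compact way to avoid that dereference. The `NullSafeExample` and `TitleOrDefault` names below are illustrative only.

```csharp
using HtmlAgilityPack;

static class NullSafeExample
{
    public static string TitleOrDefault(HtmlDocument doc)
    {
        // ?. short-circuits to null when SelectSingleNode finds nothing,
        // so InnerText is never read from a null node.
        string title = doc.DocumentNode
            .SelectSingleNode("//h1[@class='producttitle']")
            ?.InnerText;

        // Treat a missing node and an empty title the same way.
        return string.IsNullOrEmpty(title) ? "no titles" : title;
    }
}
```

With this form, the original `if (varitem == null)` test behaves as the question intended, because a missing node yields a null string rather than an exception.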
