
I'm using HtmlAgilityPack to extract some data from a website, but I can't figure out what is going wrong.

string Url = "https://www.rottentomatoes.com/browse/dvd-all/?services=netflix_iw";
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlWeb().Load(Url);

foreach ( var node in htmlDoc.DocumentNode.SelectNodes("/html/body[@class='body  ']/div[@class='body_main container']/div[@id='main_container']/div[@id='main-row']/div[@id='content-column']/div[@id='movies-collection']/div[@class='mb-movies list-view']/div[@class='mb-movie']"))
{
    string movieTitle = node.InnerText;
    richTextBox1.Text += movieTitle + System.Environment.NewLine;
}

I want to extract all movie titles from this URL by navigating with XPath. Visual Studio reports that an object reference is not set (a null reference error). Why? Could you try it for this particular case?

The following piece of code worked for me:

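// Requires: using System.Collections.Generic; using System.Linq; using HtmlAgilityPack;
string Url = "https://www.rottentomatoes.com/browse/dvd-all/?services=netflix_iw";
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlWeb().Load(Url);

// Select the inner HTML of every node whose class attribute is exactly "movieTitle".
IEnumerable<string> movieTitles = from movieNode in htmlDoc.DocumentNode.Descendants()
                                  where movieNode.GetAttributeValue("class", "").Equals("movieTitle")
                                  select movieNode.InnerHtml;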

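It uses LINQ to walk all descendant nodes, keeps those whose class attribute is "movieTitle", and returns the inner HTML of each, which contains the movie title.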
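
As for the error in the question: SelectNodes returns null, not an empty collection, when the XPath matches nothing, so the foreach fails as soon as any step of that long absolute path (for example the exact class value 'body  ') does not match the page. Below is a minimal sketch of a null-safe XPath variant. It assumes, like the LINQ version above, that the titles sit in elements with class="movieTitle", and it parses an inline sample (hypothetical markup) instead of the live page, whose structure may differ. MovieTitleExtractor and ExtractTitles are illustrative names, not part of HtmlAgilityPack.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class MovieTitleExtractor
{
    // Returns the trimmed text of every element whose class attribute
    // is exactly "movieTitle" (assumed class name, taken from the answer above).
    public static List<string> ExtractTitles(HtmlDocument doc)
    {
        // A short relative XPath is less brittle than a full path from /html.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//*[@class='movieTitle']");

        // SelectNodes returns null when nothing matches, so guard before iterating.
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes
            .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample markup standing in for the real page.
        const string html =
            "<html><body>" +
            "<div class='mb-movie'><h3 class='movieTitle'>First Movie</h3></div>" +
            "<div class='mb-movie'><h3 class='movieTitle'>Second Movie</h3></div>" +
            "</body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (string title in ExtractTitles(doc))
        {
            Console.WriteLine(title);
        }
    }
}
```

In the original WinForms code, the same guard would go around the foreach: store the result of SelectNodes in a variable, check it for null, and only then append each title to richTextBox1.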
