
C#: extracting a single value from the HTML document of a website

This is what it looks like: [image: web page format]

I've tried something like this:

var url = "https://www.tek-zence.no/";
var httpsClient = new HttpClient();
var html = await httpsClient.GetStringAsync(url);

var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);

var element = htmlDocument.DocumentNode.Descendants("div")
    .Where(node => !node.GetAttributeValue("class", "").Contains("feature-nummer")).ToString();
Console.WriteLine(element.Innertext);

Any thoughts?

With HtmlAgilityPack, you can do this:

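// Sample HTML that mimics the structure of the target page.
var text = @"<div><div class='feature-nummer'>01</div></div>";

var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(text);

// Select the first div whose class attribute contains "feature-nummer".
int number = -1; // stays -1 if the div is missing or its text is not an integer
var div = doc.DocumentNode.SelectSingleNode("//div[contains(@class, 'feature-nummer')]");
if (div != null && int.TryParse(div.InnerText, out int value))
{
    number = value;
}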

The HTML here is just a sample that resembles yours. The same code should work with the HTML you download from your page too.
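As for why the attempt in the question does not compile: `Where(...)` returns a sequence of nodes, calling `ToString()` on that sequence yields a plain string (which has no `Innertext` member), and the leading `!` selects every div that does not have the class. A minimal sketch of the LINQ-based variant with those points addressed might look like the following. The class name `FeatureNumberExample` and the method `ExtractFeatureNumber` are hypothetical names introduced for illustration, and it is an assumption that the page serves the `feature-nummer` div in its static HTML.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

public static class FeatureNumberExample
{
    // Hypothetical helper: returns the integer inside the first div whose
    // class attribute contains "feature-nummer", or null if there is no
    // such div or its text is not an integer.
    public static int? ExtractFeatureNumber(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // No negation here: keep the div that DOES carry the class,
        // and take a single node instead of the whole sequence.
        var div = doc.DocumentNode
            .Descendants("div")
            .FirstOrDefault(node =>
                node.GetAttributeValue("class", "").Contains("feature-nummer"));

        if (div != null && int.TryParse(div.InnerText.Trim(), out int value))
        {
            return value;
        }
        return null;
    }

    public static async Task Main()
    {
        // URL taken from the question; assumes the div is present in the
        // downloaded HTML rather than added later by a script.
        using var client = new HttpClient();
        var html = await client.GetStringAsync("https://www.tek-zence.no/");

        var number = ExtractFeatureNumber(html);
        Console.WriteLine(number.HasValue ? number.Value.ToString() : "not found");
    }
}
```

Keeping the parsing in its own method means it can be checked against a sample string, such as the one in the answer above, without touching the network.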

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.
