I am trying to extract the text, along with the link in the href attribute, from the following HTML:
<html><body><p>foo <a href='http://www.example.com'>bar</a><br> baz</p></body></html>
The output I am looking for is: foo http://www.example.com bar baz
The br tag should be taken into account so that the sentence is formatted correctly.
Here you go:
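using System;
using HtmlAgilityPack;

public class Program
{
    public static void Main()
    {
        var html =
            @"<html><body><p>foo <a href='http://www.example.com'>bar</a><br> baz</p></body></html>";

        // Parse the markup into a DOM that can be queried with XPath.
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // First <a> element (source of the href) and first <p> element (source of the text).
        var htmlAnchor = htmlDoc.DocumentNode.SelectSingleNode("//a");
        var htmlParagraph = htmlDoc.DocumentNode.SelectSingleNode("//p");

        string hrefValue = htmlAnchor.Attributes["href"].Value;

        // InnerText concatenates all text inside <p>; the link is appended at the end.
        Console.WriteLine(htmlParagraph.InnerText + " " + hrefValue);
    }
}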
Output:
foo bar baz http://www.example.com
Working Example: https://dotnetfiddle.net/BBYAF9
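Note that the output above puts the link at the end, while the question asks for it inline, right before the anchor text. One way to get that ordering is to walk the nodes in document order and emit the href when an a element is reached. The following is a minimal sketch under that assumption, not a definitive implementation: the LinkTextExtractor class and its Extract and Walk methods are hypothetical names introduced here, and treating br as a single space is a choice made for this example.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

// Hypothetical helper (not part of HtmlAgilityPack): flattens HTML to text,
// placing each link's href immediately before its anchor text.
public static class LinkTextExtractor
{
    public static string Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var sb = new StringBuilder();
        Walk(doc.DocumentNode, sb);

        // Collapse runs of whitespace left over from markup and <br> handling.
        return Regex.Replace(sb.ToString(), @"\s+", " ").Trim();
    }

    private static void Walk(HtmlNode node, StringBuilder sb)
    {
        switch (node.NodeType)
        {
            case HtmlNodeType.Text:
                // Decode entities such as &amp; before appending the text.
                sb.Append(HtmlEntity.DeEntitize(node.InnerText));
                return;
            case HtmlNodeType.Comment:
                return; // comments carry no visible text
        }

        if (node.Name == "br")
        {
            // Assumption: a line break becomes one space in the flattened output.
            sb.Append(' ');
            return;
        }

        if (node.Name == "a")
        {
            // Emit the href first, then fall through to the anchor's own text.
            var href = node.GetAttributeValue("href", "");
            if (href.Length > 0)
            {
                sb.Append(href).Append(' ');
            }
        }

        foreach (var child in node.ChildNodes)
        {
            Walk(child, sb);
        }
    }
}

public class Program
{
    public static void Main()
    {
        var html =
            @"<html><body><p>foo <a href='http://www.example.com'>bar</a><br> baz</p></body></html>";
        Console.WriteLine(LinkTextExtractor.Extract(html));
    }
}
```

For the HTML in the question this should print foo http://www.example.com bar baz. If a real line break is preferred for br, appending Environment.NewLine there and collapsing only spaces and tabs would be one option.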