I made a C# console application that is supposed to display the HTML source of a page.
Instead, the console app prints HtmlAgilityPack.HtmlDocument.
Can anyone explain why that is?
class Program
{
    public HtmlDocument read()
    {
        HtmlWeb htmlWeb = new HtmlWeb();
        try
        {
            HtmlAgilityPack.HtmlDocument document = htmlWeb.Load("http://www.yahoo.com");
            return document;
        }
        catch (Exception e)
        {
            Console.WriteLine("Error : " + e.ToString());
            return null;
        }
    }

    static void Main(string[] args)
    {
        Program dis = new Program();
        string text = Convert.ToString(dis.read());
        Console.WriteLine(text);
        Console.ReadLine();
    }
}
Replace
return document;
with:
return document.DocumentNode.InnerHtml;
or, if you want to extract only the text (without HTML tags):
return document.DocumentNode.InnerText;
Either way, the return type of read() has to change from HtmlDocument to string. The whole code would be:
class Program
{
    public string read()
    {
        HtmlWeb htmlWeb = new HtmlWeb();
        try
        {
            HtmlAgilityPack.HtmlDocument document = htmlWeb.Load("http://www.yahoo.com");
            return document.DocumentNode.InnerHtml;
        }
        catch (Exception e)
        {
            Console.WriteLine("Error : " + e.ToString());
            return null;
        }
    }

    static void Main(string[] args)
    {
        Program dis = new Program();
        string text = dis.read();
        Console.WriteLine(text);
        Console.ReadLine();
    }
}
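To see the difference between InnerHtml and InnerText without depending on a live site, here is a minimal sketch that parses a string literal instead. The class name and the sample markup are made up for illustration, and it assumes the HtmlAgilityPack package is referenced:

```csharp
using System;
using HtmlAgilityPack;

class InnerHtmlVsInnerText
{
    static void Main()
    {
        var document = new HtmlDocument();
        // LoadHtml parses markup from a string, so no network request is made.
        document.LoadHtml("<p>Hello <b>world</b></p>");

        // InnerHtml keeps the tags of everything under the root node.
        Console.WriteLine(document.DocumentNode.InnerHtml);
        // InnerText returns only the text content, with the tags removed.
        Console.WriteLine(document.DocumentNode.InnerText);
    }
}
```

Both properties return a string, which is why read() can no longer be declared as returning HtmlDocument.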
The default implementation of .ToString() just outputs the fully qualified name of the type, which is what you're seeing. So HtmlDocument from the HtmlAgilityPack evidently doesn't override it.
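The behaviour is easy to reproduce without HtmlAgilityPack at all. In this small sketch both classes are hypothetical stand-ins: one leaves ToString() alone, the other overrides it.

```csharp
using System;

// Hypothetical class that does not override ToString().
class PlainDocument { }

// Hypothetical class that overrides ToString() to return its content.
class PrintableDocument
{
    public override string ToString() => "<html></html>";
}

class ToStringDemo
{
    static void Main()
    {
        // Convert.ToString(object) ends up calling the object's own ToString().
        Console.WriteLine(Convert.ToString(new PlainDocument()));     // prints "PlainDocument"
        Console.WriteLine(Convert.ToString(new PrintableDocument())); // prints "<html></html>"
    }
}
```

PlainDocument is declared outside any namespace here, so its fully qualified name is just the class name; HtmlDocument lives in the HtmlAgilityPack namespace, hence the output in the question.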
From glancing at the code over on CodePlex, it looks like you need to use the Save function to save the output to an XmlWriter and then use that to get the string. I don't see another way to get at the whole contents of the page directly from that object (though admittedly I just scanned it).
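A rough sketch of that Save-based route follows. It swaps the XmlWriter for a StringWriter, on the assumption that the installed HtmlAgilityPack version exposes a Save overload accepting a TextWriter; the helper name ToHtmlString and the sample markup are invented for the example.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class SaveToStringDemo
{
    // Hypothetical helper: serialises the document into an in-memory string.
    static string ToHtmlString(HtmlDocument document)
    {
        using (var writer = new StringWriter())
        {
            // Assumes a Save(TextWriter) overload is available.
            document.Save(writer);
            return writer.ToString();
        }
    }

    static void Main()
    {
        var document = new HtmlDocument();
        // Parse a literal instead of downloading a page.
        document.LoadHtml("<html><body><p>Hello</p></body></html>");
        Console.WriteLine(ToHtmlString(document));
    }
}
```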
Edit: Amine Hajyoussef pointed you in the right direction with document.DocumentNode.InnerHtml, though note that you'll need to change the return type of the function as well.