
How to scrape a page generated with a script in C#?

Simple example: Google search page.

http://www.google.com/search?q=foobar

When I get the source of the page, I get the underlying JavaScript. I want the resulting page. What do I do?

Even though the response looks as if it is only JavaScript, it really is the full HTML. You can easily confirm this with HtmlAgilityPack:

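// Requires the HtmlAgilityPack NuGet package.
HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
// Download and parse the page in one step.
HtmlAgilityPack.HtmlDocument doc = web.Load("http://www.google.com/search?q=foobar");
// The complete markup of the parsed response.
string html = doc.DocumentNode.OuterHtml;
// SelectNodes returns null when nothing matches, so check before using the result.
var nodes = doc.DocumentNode.SelectNodes("//div"); // returns 85 nodes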
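
To make the same check without a network call, here is a minimal sketch that parses an inline HTML string and counts the matching elements. The class name `StaticHtmlCheck`, the helper `CountMatches`, and the sample markup are my own assumptions for illustration; they are not part of HtmlAgilityPack or of Google's actual response. The point it shows is that `SelectNodes` returns null rather than an empty collection when nothing matches, so the count needs a null check.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class StaticHtmlCheck
{
    // Hypothetical helper: counts the elements matching an XPath in raw HTML.
    // SelectNodes returns null (not an empty collection) when nothing matches.
    public static int CountMatches(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        // Inline sample standing in for a downloaded page (an assumption,
        // not real search-result markup).
        string html =
            "<html><body><div id='a'>one</div><div id='b'><div>two</div></div>" +
            "<script>var x = 1;</script></body></html>";

        Console.WriteLine(CountMatches(html, "//div"));   // prints 3
        Console.WriteLine(CountMatches(html, "//table")); // prints 0
    }
}
```

If the count for the elements you care about is greater than zero in the downloaded source, the content is already in the HTML and no script execution is needed to scrape it.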

