Getting the HTML source of a web page as different browsers using C#

Question

I want to use C# to get the HTML source of a web page as it would be served to different browsers (for example IE9, Chrome, Firefox). Is there a way to do this?

Answer 1

There are several ways to get the HTML source. My preferred method is HTML Agility Pack:

// HtmlWeb downloads the page from a URL; HtmlDocument.Load itself reads a local file or stream.
HtmlDocument doc = new HtmlWeb().Load("http://domain.com/resource/page.html");
doc.Save("file.htm");

WebClient in .NET also works well.

string remoteUri = "http://domain.com/resource/page.html"; // placeholder address
// Adapted from the MSDN example. WebClient implements IDisposable,
// so its use is wrapped in a using statement here.
using (WebClient myWebClient = new WebClient())
{
    // Set the User-Agent header if you need to simulate a specific browser
    myWebClient.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
    byte[] myDataBuffer = myWebClient.DownloadData(remoteUri);
    // ASCII decoding follows the MSDN example; non-ASCII characters will not survive it
    string download = Encoding.ASCII.GetString(myDataBuffer);
}

http://msdn.microsoft.com/zh-CN/library/xz398a3f.aspx

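The HTML you get is the HTML you get. What a given browser does with it is up to that browser (unless the server renders different HTML for different user agents).

If you really do need to set the user agent explicitly (to simulate different browsers), the following article shows how to do it: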

http://blog.abodit.com/2010/03/a-simple-web-crawler-in-c-using-htmlagilitypack/

(The linked article also implements a simple web crawler using HTML Agility Pack.)
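
To illustrate the point about servers rendering different HTML per user agent, here is a minimal sketch that sends a chosen User-Agent header with WebClient. The class name `UserAgentDemo`, the `Fetch` helper, and the three User-Agent strings are assumptions made for illustration; they are not taken from the linked article, and real browsers send version-specific strings that change over time.

```csharp
using System.Collections.Generic;
using System.Net;

static class UserAgentDemo
{
    // Illustrative User-Agent strings (assumed values, roughly matching
    // the browsers named in the question).
    public static readonly Dictionary<string, string> UserAgents =
        new Dictionary<string, string>
        {
            { "IE9", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)" },
            { "Chrome", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.162 Safari/535.19" },
            { "Firefox", "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20100101 Firefox/12.0" }
        };

    // Downloads the page once, sending the given User-Agent header.
    // A fresh WebClient is used per call so the header is always set.
    public static string Fetch(string url, string userAgent)
    {
        using (var client = new WebClient())
        {
            client.Headers.Add("user-agent", userAgent);
            return client.DownloadString(url);
        }
    }
}
```

Calling `UserAgentDemo.Fetch(url, UserAgentDemo.UserAgents["IE9"])` and then the same with `"Firefox"` lets you compare the two results; if they are identical, the server is probably not varying its output by user agent.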

Answer 2

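I'm no C# expert, but assuming the HTML is the same regardless of which "browser" requests the URL, you can use System.Net.WebClient (if you only need simple control) or HttpWebRequest (if you need more advanced control).

With WebClient, just create an instance and call one of its Download* methods: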

var cli = new WebClient();
string data = cli.DownloadString("http://www.stackoverflow.com");
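
The HttpWebRequest route mentioned above might look roughly like the following sketch. The class name `HttpWebRequestDemo` and the helpers `CreateRequest` and `ReadBody` are hypothetical names chosen for this example, and it assumes the response body can be read with StreamReader's default UTF-8 decoding.

```csharp
using System.IO;
using System.Net;

static class HttpWebRequestDemo
{
    // Builds a request that identifies itself with the given User-Agent string.
    // No network traffic happens until GetResponse() is called.
    public static HttpWebRequest CreateRequest(string url, string userAgent)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = userAgent;
        // Let the framework transparently decompress gzip/deflate responses.
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Sends the request and reads the whole response body as text
    // (StreamReader defaults to UTF-8; an assumption about the page's encoding).
    public static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Compared with WebClient, this exposes the User-Agent as a typed property and gives direct access to the response object, which is useful if you also want to inspect status codes or response headers.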

Answer 1
2 2012-06-29 03:10:47

Answer 2
1 2012-06-29 03:12:35
