How to get the HTML page source with C#

I want to save a complete web page (an ASP page) to a local drive as an .htm file, given its URL, but I have not succeeded.

Code:

public string Fn_DownloadWebPageComplete(string link_Pagesource)
{
    //--------- Download Complete (earlier attempts, left disabled) ------------------
    // using (WebClient client = new WebClient()) // WebClient class inherits IDisposable
    // {
    //     client.DownloadFile(link_Pagesource, @"D:\S1.htm");
    // }
    // var client1 = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(link_Pagesource);
    // client1.CookieContainer = new System.Net.CookieContainer();

    //--------- Download Page Source ------------------
    // Note: the URL is hard-coded here; link_Pagesource is not used.
    HttpWebRequest URL_pageSource = (HttpWebRequest)WebRequest.Create("https://www.digikala.com");

    URL_pageSource.Timeout = 360000;           // milliseconds
    URL_pageSource.ReadWriteTimeout = 360000;  // milliseconds
    URL_pageSource.AllowAutoRedirect = true;
    URL_pageSource.MaximumAutomaticRedirections = 300;

    // Read the whole body before the response is disposed.
    using (WebResponse MyResponse_PageSource = URL_pageSource.GetResponse())
    using (StreamReader str_PageSource = new StreamReader(MyResponse_PageSource.GetResponseStream(), System.Text.Encoding.UTF8))
    {
        string pagesource1 = str_PageSource.ReadToEnd();
        return pagesource1;
    }
}

Error:

Too many automatic redirections were attempted.

I tried the code above, but it did not work.

Many URLs download successfully with this code, but this URL does not.

Here is one way to do it:

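string url = "https://www.digikala.com/";
string result; // declared outside the using blocks so it can be used afterwards

using (HttpClient client = new HttpClient())
using (HttpResponseMessage response = client.GetAsync(url).Result)
using (HttpContent content = response.Content)
{
    // .Result blocks the calling thread until the download finishes.
    result = content.ReadAsStringAsync().Result;
}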

The result variable will then contain the page's HTML, which you can save to a file like this:

System.IO.File.WriteAllText("path/filename.html", result);

Note: you have to import this namespace:

using System.Net.Http;
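
The snippet above blocks on .Result, which can deadlock in applications that have a synchronization context (for example WinForms or classic ASP.NET). The following is a minimal async sketch of the same download-and-save idea, not a definitive implementation. The class name PageDownloader and its two method names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper class; the names are illustrative only.
public static class PageDownloader
{
    // One shared HttpClient, reused for every request.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the response body at `url` and returns it as a string.
    public static async Task<string> DownloadHtmlAsync(string url)
    {
        using (HttpResponseMessage response = await Client.GetAsync(url))
        {
            // Throws HttpRequestException if the status code is not 2xx.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }

    // Writes the HTML text to `path`, overwriting any existing file.
    public static void SaveHtml(string html, string path)
    {
        File.WriteAllText(path, html);
    }
}
```

From an async method, this could be used as `string html = await PageDownloader.DownloadHtmlAsync("https://www.digikala.com/");` followed by `PageDownloader.SaveHtml(html, "path/filename.html");`.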

Update: if you are using a legacy version of Visual Studio, you can use WebClient or WebRequest for the same purpose, as in the snippets below, but updating Visual Studio is the better solution.

using (WebClient client = new WebClient ())
{
    client.DownloadFile("https://www.digikala.com", @"C:\localfile.html");
}
using (WebClient client = new WebClient ())
{
    string htmlCode = client.DownloadString("https://www.digikala.com");
}
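
One commonly reported cause of "Too many automatic redirections were attempted" with HttpWebRequest is that its CookieContainer is null by default, so a site that sets a cookie and then redirects never sees the cookie come back. HttpClient's default handler keeps cookies, which may be why the HttpClient version behaves differently. Whether this is the cause for this particular site is an assumption, not something confirmed in the thread. A minimal sketch of a legacy request with cookies enabled follows; the class name LegacyDownloader, the redirect limit of 10, and the User-Agent string are all illustrative choices.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; the names and values are illustrative only.
public static class LegacyDownloader
{
    // Builds a request that keeps cookies across redirects.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = new CookieContainer(); // null by default
        request.AllowAutoRedirect = true;
        request.MaximumAutomaticRedirections = 10;       // assumed limit
        // Assumed User-Agent; some sites respond differently when none is sent.
        request.UserAgent = "Mozilla/5.0 (compatible; ExampleDownloader/1.0)";
        return request;
    }

    // Sends the request and returns the response body decoded as UTF-8.
    public static string DownloadHtml(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `LegacyDownloader.DownloadHtml("https://www.digikala.com")` returns the page text, which can then be written to disk with System.IO.File.WriteAllText. On recent .NET versions the WebRequest APIs are marked obsolete, so this sketch is mainly relevant to older projects.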
