How do I validate a URL in C# to avoid 404 errors?

Question

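I need to write a tool in C# that reports broken URLs. A URL should be reported as broken only if the user would see a 404 error in the browser. I believe there may be some tricks for handling web servers that do URL rewriting. Here is what I have so far. As you can see, only some of the URLs validate incorrectly.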

string url = "";

// TEST CASES
//url = "http://newsroom.lds.org/ldsnewsroom/eng/news-releases-stories/local-churches-teach-how-to-plan-for-disasters";   //Prints "BROKEN", although this is getting re-written to good url below.
//url = "http://beta-newsroom.lds.org/article/local-churches-teach-how-to-plan-for-disasters";  // Prints "GOOD"
//url = "http://";     //Prints "BROKEN"
//url = "google.com";     //Prints "BROKEN" although this should be good.
//url = "www.google.com";     //Prints "BROKEN" although this should be good.
//url = "http://www.google.com";     //Prints "GOOD"

try
{

    if (url != "")
    {
        WebRequest Irequest = WebRequest.Create(url);
        WebResponse Iresponse = Irequest.GetResponse();
        if (Iresponse != null)
        {
            _txbl.Text = "GOOD";
        }
    }
}
catch (Exception ex)
{
    _txbl.Text = "BROKEN";
}

Answer 1

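For one thing, Irequest and Iresponse shouldn't be named like that. They should just be webRequest and webResponse, or even just request and response. The capital "I" prefix is generally used only for interface names, not for instance variables.

To check URL validity, use UriBuilder to get a Uri. Then use HttpWebRequest and HttpWebResponse so that you can check the strongly typed status code of the response. Finally, you should be a bit more informative about what was broken.

The additional .NET types introduced here are UriBuilder, HttpWebRequest, HttpWebResponse, and HttpStatusCode.

Sample: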

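try
{
    if (!string.IsNullOrEmpty(url))
    {
        // UriBuilder defaults the scheme to http when the string has none.
        UriBuilder uriBuilder = new UriBuilder(url);
        // WebRequest.Create returns WebRequest, so a cast is needed.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uriBuilder.Uri);
        // Dispose the response so the underlying connection is released.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            if (response.StatusCode == HttpStatusCode.OK)
            {
                _txbl.Text = "URL appears to be good.";
            }
            else //There are a lot of other status codes you could check for...
            {
                _txbl.Text = string.Format("URL might be ok. Status: {0}.",
                                           response.StatusCode);
            }
        }
    }
}
catch (WebException ex)
{
    // GetResponse throws for error statuses such as 404; the status code is on ex.Response.
    HttpWebResponse errorResponse = ex.Response as HttpWebResponse;
    if (errorResponse != null && errorResponse.StatusCode == HttpStatusCode.NotFound)
    {
        _txbl.Text = "Broken - 404 Not Found";
    }
    else
    {
        _txbl.Text = string.Format("Broken - Other error: {0}", ex.Message);
    }
}
catch (Exception ex)
{
    // Covers malformed input, e.g. a UriFormatException from UriBuilder.
    _txbl.Text = string.Format("Broken - Other error: {0}", ex.Message);
}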
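The UriBuilder step is what lets the scheme-less test cases from the question ("google.com", "www.google.com") become requestable URLs. The following is a minimal sketch of that step in isolation; the class name UriBuilderDemo and the helper ToAbsolute are hypothetical names chosen for illustration, not part of the answer.

```csharp
using System;

public static class UriBuilderDemo
{
    // Hypothetical helper: returns the absolute URI that UriBuilder
    // produces for the input. UriBuilder defaults the scheme to http
    // when the input string does not specify one.
    public static string ToAbsolute(string url)
    {
        return new UriBuilder(url).Uri.AbsoluteUri;
    }

    public static void Main()
    {
        // Test cases taken from the question.
        Console.WriteLine(ToAbsolute("google.com"));
        Console.WriteLine(ToAbsolute("www.google.com"));
        Console.WriteLine(ToAbsolute("http://www.google.com"));
    }
}
```

Inputs that cannot be turned into a URI at all, such as "http://" on its own, make the UriBuilder constructor throw a UriFormatException, which is why the sample above keeps a general catch block.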

Answer 2

Prepend http:// or https:// to the URL and pass it to the WebClient.OpenRead method. It will throw a WebException if the URL is malformed.

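private WebClient webClient = new WebClient();

try
{
    // OpenRead throws WebException if the URL is invalid or the request fails.
    using (Stream strm = webClient.OpenRead(URL))
    {
        // Reaching this point means the server returned a response stream.
    }
}
catch (WebException)
{
    // Rethrow with "throw;" so the original stack trace is preserved.
    throw;
}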

Answer 3

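The problem, I believe, is that most of those "should be good" cases are actually handled at the browser level. If you omit the "http://", it is an invalid request, but the browser puts it in for you.

So maybe you could do the same kind of check that the browser would do:

Make sure there is an "http://" at the beginning
Make sure there is a "www." at the beginning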
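The first of those checks could be sketched as follows. This is only an illustration under stated assumptions: the class UrlNormalizer and the method EnsureScheme are hypothetical names, and the helper treats any input containing "://" as already having a scheme.

```csharp
using System;

public static class UrlNormalizer
{
    // Hypothetical helper: prepends "http://" when the input has no scheme,
    // roughly what a browser does for text typed into the address bar.
    public static string EnsureScheme(string url)
    {
        // Leave null, empty, or whitespace-only input unchanged.
        if (string.IsNullOrWhiteSpace(url))
        {
            return url;
        }

        string trimmed = url.Trim();

        // Assumption: anything containing "://" already carries a scheme.
        if (trimmed.IndexOf("://", StringComparison.Ordinal) >= 0)
        {
            return trimmed;
        }

        return "http://" + trimmed;
    }
}
```

The sketch leaves the host untouched, so whether to also try a "www." prefix is left to the caller.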

Answer 4

Use a regex...

// Requires: using System.Text.RegularExpressions;
public static bool IsUrl(string Url)
{
    string strRegex = "^(https?://)"
    // user@ (the stray space after the colon in the source pattern is removed)
    + "?(([0-9a-z_!~*'().&=+$%-]+:)?[0-9a-z_!~*'().&=+$%-]+@)?"
    + @"(([0-9]{1,3}\.){3}[0-9]{1,3}" // IP- 199.194.52.184
    + "|" // allows either IP or domain
    + @"([0-9a-z_!~*'()-]+\.)*" // tertiary domain(s)- www.
    + @"([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\." // second level domain
    + "[a-z]{2,6})" // first level domain- .com or .museum
    + "(:[0-9]{1,4})?" // port number- :80
    + "((/?)|" // a slash isn't required if there is no file name
    + "(/[0-9a-z_!~*'().;?:@&=+$,%#-]+)+/?)$";
    Regex re = new Regex(strRegex);

    // This checks syntax only; it does not contact the server, so it cannot detect a 404.
    return re.IsMatch(Url);
}

Pulled from here: http://www.osix.net/modules/article/?id=586

If you look around, there are lots of different regexes for this.
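As an alternative to maintaining a regex, a syntax-only check can be sketched with the built-in Uri.TryCreate. This is not the technique from the answer above, just a comparable illustration; the class UrlSyntax and the method IsAbsoluteHttpUrl are hypothetical names. Like the regex, it says nothing about whether the server will return a 404.

```csharp
using System;

public static class UrlSyntax
{
    // Hypothetical helper: true only for absolute http/https URLs
    // that System.Uri is able to parse.
    public static bool IsAbsoluteHttpUrl(string url)
    {
        Uri parsed;
        return Uri.TryCreate(url, UriKind.Absolute, out parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp
                || parsed.Scheme == Uri.UriSchemeHttps);
    }
}
```

With this check, "google.com" from the question's test cases is rejected because it has no scheme, so it would need "http://" prepended first.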

Answer 1
8 2010-09-29 22:38:32

Answer 2
0 2010-09-29 22:38:28

Answer 3
-1 2010-09-29 22:35:43

Answer 4
-1 2010-09-29 22:46:25
