Error handling in the Python requests module when a URL does not exist

Question

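I am trying to work out error handling for the requests module in Python, so that I am notified when a URL is unavailable, i.e. HTTPError, ConnectionError, Timeout, etc.

The problem I am running into is that I seem to get a 200 status response even for fake URLs.

I have trawled through SO and various other web resources and tried many different approaches that all seem to aim at the same goal, but so far I have come up empty.

I have stripped the code back to the bare basics to keep things simple.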

import requests

urls = ['http://fake-website.com', 
        'http://another-fake-website.com',
        'http://yet-another-fake-website.com',
        'http://google.com']

for url in urls:
    r = requests.get(url,timeout=1)
    try:
        r.raise_for_status()
    except:
        pass
    if r.status_code != 200:
        print ("Website Error: ", url, r)
    else:
        print ("Website Good: ", url, r)

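I expect the first 3 URLs in the list to be classified as 'Website Error:' because they are URLs I just made up. The final URL in the list is obviously real, so it should be the only one listed as 'Website Good:'.

What happens is that the first URL produces the correct result from the code, since it gives a response code of 503. According to https://httpstatus.io/, however, the next two URLs do not produce a status_code at all and only show ERROR with: Cannot find URI. another-fake-website.com another-fake-website.com:80

So I expect every URL in the list except the last one to show as 'Website Error:'.

Output

When running the script on a Raspberry Pi: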

Python 2.7.9 (default, Sep 26 2018, 05:58:52) 
[GCC 4.9.2] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
('Website Error: ', 'http://fake-website.com', <Response [503]>)
('Website Good: ', 'http://another-fake-website.com', <Response [200]>)
('Website Good: ', 'http://yet-another-fake-website.com', <Response [200]>)
('Website Good: ', 'http://google.com', <Response [200]>)
>>>

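If I enter all 4 URLs at https://httpstatus.io/, I get the following result:

It shows one 503, one 200, and two URLs with no status code that simply show an error.

Update

So I thought I would check this on Windows using PowerShell, following this example: https://stackoverflow.com/a/52762602/5251044

Here is the output: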

c:\Testing>powershell -executionpolicy bypass -File .\AnyName.ps1
0 - http://fake-website.com
200 - http://another-fake-website.com
200 - http://yet-another-fake-website.com
200 - http://google.com

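As you can see, I am no further forward.

Update 2

After further discussion with Fozoro and trying various options without finding a fix, I thought I would try the code with urllib2 instead of requests.

Here is the changed code: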

from urllib2 import urlopen
import socket

urls = ['http://another-fake-website.com',
        'http://fake-website.com',
        'http://yet-another-fake-website.com',
        'http://google.com',
        'dskjhkjdhskjh.com',
        'doioieowwros.com']

for url in urls:

    try:
        r  = urlopen(url, timeout = 5)
        r.getcode()
    except:
        pass
    if r.getcode() != 200:
        print ("Website Error: ", url, r.getcode())
    else:
        print ("Website Good: ", url, r.getcode())

Unfortunately the resulting output is still not correct, though it differs slightly from the output of the previous code; see below:

Python 2.7.9 (default, Sep 26 2018, 05:58:52) 
[GCC 4.9.2] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
('Website Good: ', 'http://another-fake-website.com', 200)
('Website Good: ', 'http://fake-website.com', 200)
('Website Good: ', 'http://yet-another-fake-website.com', 200)
('Website Good: ', 'http://google.com', 200)
('Website Good: ', 'dskjhkjdhskjh.com', 200)
('Website Good: ', 'doioieowwros.com', 200)
>>>

This time it shows a 200 response for everything, which is very odd.

Answer 1

You should put r = requests.get(url,timeout=1) inside the try: block. So your code needs to look like this:

import requests

urls = ['http://fake-website.com', 
        'http://another-fake-website.com',
        'http://yet-another-fake-website.com',
        'http://google.com']

for url in urls:
    try:
        r = requests.get(url,timeout=1)
        r.raise_for_status()
    except:
        pass
    if r.status_code != 200:
        print ("Website Error: ", url, r)
    else:
        print ("Website Good: ", url, r)

Output:

Website Error:  http://fake-website.com <Response [503]>
Website Error:  http://another-fake-website.com <Response [503]>
Website Error:  http://yet-another-fake-website.com <Response [503]>
Website Good:  http://google.com <Response [200]>

I hope this helps!
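
One caveat about the loop above: if requests.get itself raises, r is not reassigned in that iteration, so the later r.status_code check reads the response from the previous URL (or fails with NameError on the first URL). A minimal sketch of a variant that reports each failure explicitly is shown here. The helper name check_url and its (ok, detail) return value are assumptions made for illustration, not part of the original answer, and it treats any status below 400 as good rather than only 200.

```python
import requests


def check_url(url, timeout=5):
    """Return (ok, detail) for one URL (hypothetical helper).

    ok is True only when a response arrived and raise_for_status()
    did not object, i.e. the final status code is below 400.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        # A response arrived, but with a 4xx/5xx status (for example 503).
        return False, "HTTP {}".format(exc.response.status_code)
    except requests.exceptions.RequestException as exc:
        # No usable response at all: DNS failure, refused connection,
        # timeout, malformed URL, and so on.
        return False, type(exc).__name__
    return True, "HTTP {}".format(response.status_code)
```

Calling check_url(url) for each entry in urls and printing 'Website Good:' or 'Website Error:' depending on ok keeps every result tied to the URL that produced it. It will still report a fake domain as good if something on the network answers for it with a 200.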

Answer 2

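For me, the cause turned out to be the page my ISP serves for invalid URLs: it was that page returning 200, not the fake site.

This can be verified by printing the content of the returned page with requests.get('http://fakesite').text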
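
A complementary check, sketched here under assumptions, is to see whether the local DNS resolver returns an address for a hostname that should not exist. The function name resolver_rewrites_missing_names, the random probe hostname, and the injectable resolve parameter are all hypothetical choices for this example. A True result only suggests that some resolver on the path answers for nonexistent names; it does not identify which one.

```python
import socket
import uuid


def resolver_rewrites_missing_names(resolve=socket.getaddrinfo):
    """Return True if a hostname that should not exist still resolves."""
    # Assumption: a random 32-character label under .com is very
    # unlikely to be a registered domain.
    probe = uuid.uuid4().hex + ".com"
    try:
        resolve(probe, 80)
    except socket.gaierror:
        # Expected behaviour: the lookup fails for a nonexistent name.
        return False
    # The name resolved, so the resolver is probably substituting its
    # own address (for example an ISP search or error page).
    return True
```

If this returns True, a 200 status for a made-up domain is plausible, because the HTTP request reaches the substitute server rather than failing at the DNS step.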

Answer 1
2 2019-04-07 15:34:29

Answer 2
1 2021-01-01 21:13:19
