Why does my urllib.request return an HTTP Error 403?
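I'm trying to write a program in Python that downloads a series of product images from a site. The site stores its images under a URL format like https://www.sitename.com/XYZabcde, where XYZ is three letters identifying the product brand and abcde is a number between 00000 and 30000. Here is my code: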
import urllib.request

def down(i, inp):
    full_path = 'images/image-{}.jpg'.format(i)
    url = "https://www.sitename.com/{}{}.jpg".format(inp,i)
    urllib.request.urlretrieve(url, full_path)
    print("saved")
    return None

inp = input("brand :" )
i = 20100
while i <= 20105:
    x = str(i)
    y = x.zfill(5)
    z = "https://www.sitename.com/{}{}.jpg".format(inp,y)
    print(z)
    down(y, inp)
    i += 1
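With the code I've written, I can successfully download a series of images that I know exist; for example, brand RVL with numbers 20100 to 20105 downloads those six images without problems. However, when I widen the while loop to include links that I'm not sure will return an image, I get this error: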
Traceback (most recent call last):
  File "c:/Users/euan/Desktop/university/programming/Python/parser/test - Copy.py", line 20, in <module>
    down(y, inp)
  File "c:/Users/euan/Desktop/university/programming/Python/parser/test - Copy.py", line 6, in down
    urllib.request.urlretrieve(url, full_path)
  File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 640, in http_response
    response = self.parent.error(
  File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
What can I do to detect and skip any URLs that would produce this result?
You can't know in advance which URLs you are not allowed to access, but you can wrap the download in a try-except:
import urllib.request, urllib.error
...
def down(i, inp):
    full_path = 'images/image-{}.jpg'.format(i)
    url = "https://www.sitename.com/{}{}.jpg".format(inp,i)
    try:
        urllib.request.urlretrieve(url, full_path)
        print("saved")
    except urllib.error.HTTPError as e:
        print("failed:", e)
    return None
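In this case, whenever a URL can't be fetched it will just print something like "failed: HTTP Error 403: Forbidden", and the program will carry on with the next one.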
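To see how this fits into the whole loop, here is a minimal sketch that walks a range of numbers and records which ones were saved and which were refused. The function name `download_range`, the `BASE_URL` constant, and the `fetch` parameter are assumptions made for illustration and are not part of the original code; `fetch` only exists so the function can be tried without hitting the network. It also assumes the `images` directory already exists.

```python
import urllib.error
import urllib.request

# Placeholder host taken from the question; swap in the real site.
BASE_URL = "https://www.sitename.com/{}{}.jpg"


def download_range(brand, start, stop, fetch=urllib.request.urlretrieve):
    """Try every number from start to stop (inclusive), skipping failures.

    `fetch` is a hypothetical hook (not in the original code); by default it
    is urllib.request.urlretrieve, which needs the target folder to exist.
    """
    saved, failed = [], []
    for i in range(start, stop + 1):
        number = str(i).zfill(5)  # pad to five digits, as in the question
        url = BASE_URL.format(brand, number)
        full_path = "images/image-{}.jpg".format(number)
        try:
            fetch(url, full_path)
            saved.append(number)
        except urllib.error.HTTPError as e:
            # e.code is the numeric HTTP status, e.g. 403 or 404
            failed.append((number, e.code))
    return saved, failed
```

Calling `download_range(inp, 20100, 20105)` would return two lists, so you can print or log the refused numbers after the loop finishes instead of stopping at the first 403. Note that this only catches `HTTPError`; connection problems raise `urllib.error.URLError`, which you may want to handle separately.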