
You don't have permission to access this resource: Python web scraping

I am trying to scrape a website, and when I do, I get the output shown below. Is there any way to scrape this site?

import requests
from bs4 import BeautifulSoup

url = "https://www.mustang6g.com/forums/threads/pre-collision-alert-system.132807/"

page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
print(soup)

The output of the code above is as follows:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access this resource.</p>
</body></html>

The web server expects a User-Agent header to be sent with the request:

import requests

# Send a browser-like User-Agent instead of the default python-requests one
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) '
           'AppleWebKit/537.36 (KHTML, like Gecko) '
           'Chrome/75.0.3770.80 Safari/537.36'}

URL = 'https://www.mustang6g.com/forums/threads/pre-collision-alert-system.132807/'

response = requests.get(URL, headers=headers)

print(response.text)

By passing this header, we tell the server that we are a browser (the User-Agent string starts with Mozilla/5.0) :)
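
If you make several requests, it may be convenient to set the header once on a session and to check the status code instead of printing an error page. The following is only a sketch under assumptions: the helper names make_session and fetch_html and the 10-second timeout are made up for illustration, and the User-Agent string is the one from the answer above. Whether a User-Agent alone is enough depends on the site; some servers check more than this one header.

```python
import requests

# Assumption: the browser User-Agent string from the answer above;
# any current desktop browser string should work the same way.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/75.0.3770.80 Safari/537.36"
    )
}


def make_session(headers=BROWSER_HEADERS):
    """Return a Session that sends `headers` with every request (hypothetical helper)."""
    session = requests.Session()
    session.headers.update(headers)
    return session


def fetch_html(url, session=None, timeout=10):
    """Fetch `url` and return the body text, raising on 4xx/5xx responses."""
    session = session or make_session()
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # a 403 Forbidden raises requests.HTTPError here
    return response.text


if __name__ == "__main__":
    url = "https://www.mustang6g.com/forums/threads/pre-collision-alert-system.132807/"
    try:
        html = fetch_html(url)
    except requests.RequestException as err:
        # Covers HTTP errors such as 403 as well as connection problems
        print("Request failed:", err)
    else:
        print(html[:200])
```

With raise_for_status(), a 403 response surfaces as an exception rather than being parsed as if it were the real page.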
