How do I get rid of an error when trying to scrape a site?

When I try to scrape Woolworths for grocery price data, I get the following error: urllib.error.HTTPError: HTTP Error 403: Forbidden

Here's what my code looks like:

    import requests
    from bs4 import BeautifulSoup

    session = requests.Session()
    url = "https://www.woolworths.com.au/shop/browse/fruit-veg"
    headers = {"User-Agent": "Mozilla/5.0"}
    req = session.get(url, headers=headers)
    fruitAndVeg = BeautifulSoup(req.text, "html.parser")

Try using a complete user-agent such as

    headers = {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) FxiOS/96.0 Mobile/15E148 Safari/605.1.15"}
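
Since the traceback in the question comes from urllib, here is a minimal standard-library sketch of the same idea. The helper names build_request and fetch_html are made up for illustration, and the 10-second timeout is an assumption. It attaches the full user-agent to the request and reports the status code if the server still refuses. The site may apply other checks as well, so this is something to try, not a guaranteed fix.

```python
import urllib.error
import urllib.request

# User-Agent string taken from the suggestion above.
FULL_USER_AGENT = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 12_2 like Mac OS X) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) "
    "FxiOS/96.0 Mobile/15E148 Safari/605.1.15"
)


def build_request(url, user_agent=FULL_USER_AGENT):
    """Hypothetical helper: return a Request carrying the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Hypothetical helper: return the page HTML, or None on an HTTP error."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")
    except urllib.error.HTTPError as err:
        # A 403 here means the server rejected the request despite the header.
        print(f"Request failed: HTTP {err.code} {err.reason}")
        return None


if __name__ == "__main__":
    html = fetch_html("https://www.woolworths.com.au/shop/browse/fruit-veg")
    print("blocked" if html is None else f"received {len(html)} characters")
```

With requests, the same FULL_USER_AGENT value can be passed as headers={"User-Agent": FULL_USER_AGENT} to session.get, and req.status_code can be checked before parsing.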
