
How do I get rid of an error when trying to scrape a site?

When I try to scrape Woolworths for grocery price data, I get the following error: urllib.error.HTTPError: HTTP Error 403: Forbidden

    import requests
    from bs4 import BeautifulSoup

    session = requests.session()
    url = "https://www.woolworths.com.au/shop/browse/fruit-veg"
    headers = {"User-Agent": "Mozilla/5.0"}
    req = session.get(url, headers=headers)
    fruitAndVeg = BeautifulSoup(req.text, "html.parser")

Try using a complete User-Agent string instead of the bare "Mozilla/5.0", for example:

    headers = {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) FxiOS/96.0 Mobile/15E148 Safari/605.1.15"}
