Scraping Yell with Python requests gives 403 error

Question

I have this code:

import requests

url = "https://www.yell.com/s/launderettes-birmingham.html"

# Use a session so cookies persist across requests.
s = requests.Session()
headers = {
    # Browser-like User-Agent string.
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36",
}
r = s.get(url, headers=headers)
print(r.status_code)

but I get a 403 status code instead of 200.

I can scrape this data with Selenium, but is there a way to scrape it with requests?

Answer 1

If you modify your code like so:

print(r.text)
print(r.status_code)

you will see that the reason you are getting a 403 status code is that Yell uses Cloudflare's browser check.

Because that check relies on JavaScript, which requests does not execute, there is no reliable way to get past it with the requests module alone.
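
To confirm in code that a response is a Cloudflare check rather than a real page, something along these lines may help. It is only a heuristic sketch: the function name and the marker strings are assumptions of mine, not an official API, and Cloudflare can change its challenge page at any time.

```python
# Heuristic markers (assumed, not exhaustive) often seen on challenge pages.
CHALLENGE_MARKERS = (
    "just a moment",
    "checking your browser",
    "cf-browser-verification",
    "challenge-platform",
)


def looks_like_cloudflare_challenge(status_code, headers, body):
    """Guess whether a response is a Cloudflare browser check, not real content."""
    # Normalise header names and values so the lookup is case-insensitive.
    lowered = {str(k).lower(): str(v).lower() for k, v in headers.items()}

    # Responses proxied by Cloudflare normally carry these headers.
    served_by_cloudflare = (
        "cloudflare" in lowered.get("server", "") or "cf-ray" in lowered
    )
    if not served_by_cloudflare:
        return False

    # Cloudflare may label challenged responses explicitly.
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # Otherwise fall back to status code plus body text.
    text = body.lower()
    return status_code in (403, 503) and any(m in text for m in CHALLENGE_MARKERS)
```

With the code from the question you would call it as looks_like_cloudflare_challenge(r.status_code, r.headers, r.text).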

Since you mentioned you are going to use Selenium, make sure to use the undetected driver package. Also, be sure to rotate your IP to avoid getting it blocked.
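
A rough sketch of the Selenium route is below. I am assuming the package meant here is undetected-chromedriver (pip install undetected-chromedriver); the function name, the driver_factory parameter and the fixed wait are my own illustrative choices, and there is no guarantee this gets past the check on any given day.

```python
import time


def fetch_rendered_html(url, driver_factory=None, wait_seconds=10):
    """Load a page in a real browser and return the HTML after JavaScript has run."""
    if driver_factory is None:
        # Third-party package, imported lazily so the function can be
        # exercised with a stand-in driver when Chrome is not available.
        import undetected_chromedriver as uc

        driver_factory = uc.Chrome

    driver = driver_factory()
    try:
        driver.get(url)
        # Crude fixed wait to give the browser check time to finish;
        # an explicit wait on a page element would be more robust.
        time.sleep(wait_seconds)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Calling fetch_rendered_html("https://www.yell.com/s/launderettes-birmingham.html") should then return the page source, which you can hand to your usual HTML parser. IP rotation is not covered by this sketch.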

1 answer

solution1
1 2022-05-13 10:48:00