Facing 403 error while scraping Indeed using Python
I need to scrape 'https://in.indeed.com/' using requests. When I run the code, it returns a 403 error.
Can anyone tell me the solution?
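import requests

url = "https://in.indeed.com"
# The User-Agent string is split across two adjacent literals, which Python joins into one.
hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                     'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36'}
result = requests.get(url, headers=hdr)
print(result)  # prints the response object, which includes the status code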
So far I have only tried to check the status code of the website, and it already shows the error.
Note: I need to do the web scraping without using Selenium.
It appears that there are some headers missing in your request. I also get a 403 when I make the request like that. However, a copied cURL request works:
Try the following:
(However, I'm assuming they take some measures against web scraping, so you may run into further problems. I'm guessing that you also have to save the cookie or something like that.)
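As a rough illustration of the two ideas above (sending more of the headers a browser would send, and keeping cookies between requests), here is a minimal sketch using `requests.Session`. The header values and the helper names `build_session` and `fetch_status` are my own assumptions, not the headers from the copied cURL request, and the site may still answer 403 even with them.

```python
import requests

# Assumed header set, modelled on what a desktop Chrome browser typically sends.
# These values are illustrative only; replace them with the headers from a
# request copied out of your own browser's developer tools.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/106.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Upgrade-Insecure-Requests": "1",
}


def build_session(extra_headers=None):
    """Return a Session that sends browser-like headers and stores cookies."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    if extra_headers:
        # Hypothetical hook for headers such as Referer or a copied Cookie value.
        session.headers.update(extra_headers)
    return session


def fetch_status(url, session=None, timeout=10):
    """GET the URL and return the HTTP status code of the response."""
    session = session or build_session()
    response = session.get(url, timeout=timeout)
    return response.status_code
```

Calling `fetch_status("https://in.indeed.com")` performs a real request and returns the status code. Reusing the same session for follow-up requests sends back any cookies the server set on the first response.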