I am trying to get an HTTP response from a website using the requests module, but I get status code 410 in my response:
<Response [410]>
From the documentation, it appears that a 410 (Gone) status means the content has been intentionally made unavailable to clients and that no forwarding URL is known. Is this indeed the case, or am I missing something? I am trying to confirm whether the webpage can be scraped at all:
import requests

url = 'http://www.b2i.us/profiles/investor/ResLibraryView.asp?ResLibraryID=81517&GoTopage=3&Category=1836&BzID=1690&G=666'
try:
    response = requests.get(url)
except requests.exceptions.RequestException as e:
    # Raised for connection-level failures (DNS, timeout, refused connection)
    print(e)
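One detail worth noting about the snippet above: `requests.get` does not raise an exception for a 410 by itself, so the `except` branch only fires for connection-level failures. The status has to be checked explicitly, for example with `raise_for_status()`. A minimal offline sketch follows; the hand-built `Response` object and the `example.com` URL are placeholders for illustration and are not part of the original post.

```python
import requests

# Build a bare Response offline to stand in for the server's reply
# (illustrative only; a real one would come from requests.get).
response = requests.Response()
response.status_code = 410
response.url = "http://example.com/gone"  # placeholder URL

try:
    # Raises requests.exceptions.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    outcome = "ok"
except requests.exceptions.HTTPError as e:
    outcome = f"HTTP error: {e}"
except requests.exceptions.RequestException as e:
    outcome = f"request failed: {e}"

print(outcome)
```

With a real request, the same `try` block would wrap `requests.get(url)` followed by `response.raise_for_status()`.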
Some websites don't respond well to HTTP requests that send 'python-requests' as the User-Agent string.
You can get a 200 OK response if you set the User-Agent header to a browser-like value such as 'Mozilla/5'.
import requests

url = 'http://www.b2i.us/profiles/investor/ResLibraryView.asp?ResLibraryID=81517&GoTopage=3&Category=1836&BzID=1690&G=666'
# Override the default 'python-requests/x.y.z' User-Agent
headers = {'User-Agent': 'Mozilla/5'}
response = requests.get(url, headers=headers)
print(response)
<Response [200]>
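To see what the server actually receives in each case, the request can be prepared without being sent and its headers inspected. This is only a sketch: `prepare_get` is a hypothetical helper name, and nothing here contacts the site.

```python
import requests

def prepare_get(url, user_agent=None):
    """Hypothetical helper: build a GET request without sending it."""
    session = requests.Session()
    if user_agent is not None:
        # Replace the session's default User-Agent header
        session.headers["User-Agent"] = user_agent
    request = requests.Request("GET", url)
    # prepare_request merges the session headers into the request
    return session.prepare_request(request)

url = "http://www.b2i.us/profiles/investor/ResLibraryView.asp?ResLibraryID=81517&GoTopage=3&Category=1836&BzID=1690&G=666"

default_ua = prepare_get(url).headers["User-Agent"]
custom_ua = prepare_get(url, "Mozilla/5").headers["User-Agent"]

print(default_ua)  # the library default, which begins with "python-requests/"
print(custom_ua)
```

Setting the header on a `Session` rather than per call keeps the same User-Agent across every request made through that session.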
This works on Mac OS X, but I am having issues with the same approach on Windows, in a VMware virtual machine that I run automated tasks from. Why would the behavior be different? Is there a separate workaround for Windows machines?
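The thread does not establish why the two machines differ. One way to narrow it down is to run the same diagnostic on both and compare the output. The sketch below assumes the difference might lie in the library version or in proxy settings picked up from the environment, which is a guess rather than a confirmed cause; `describe_environment` is a hypothetical helper name.

```python
import platform
import requests

def describe_environment(url):
    """Hypothetical helper: collect settings that can differ between machines."""
    return {
        "os": platform.system(),
        "python": platform.python_version(),
        "requests": requests.__version__,
        "default_user_agent": requests.utils.default_user_agent(),
        # Proxies requests would pick up from environment variables or OS settings
        "proxies": requests.utils.get_environ_proxies(url),
    }

url = "http://www.b2i.us/profiles/investor/ResLibraryView.asp?ResLibraryID=81517&GoTopage=3&Category=1836&BzID=1690&G=666"

info = describe_environment(url)
for key, value in info.items():
    print(f"{key}: {value}")
```

If the two outputs match, the next thing to compare would be the response body and headers returned on each machine.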