python requests handling certain http responses

Question

I am trying to get an http response from a website using the requests module. I get status code 410 in my response:

<Response [410]>

From the documentation, it appears that a 410 (Gone) status means the resource has been intentionally made unavailable to clients and that no forwarding URL is known. Is this indeed the case, or am I missing something? I am trying to confirm whether the webpage can be scraped at all:

import requests

url='http://www.b2i.us/profiles/investor/ResLibraryView.asp?ResLibraryID=81517&GoTopage=3&Category=1836&BzID=1690&G=666'

try:
    response = requests.get(url)
except requests.exceptions.RequestException as e:
    print(e)

Answer 1

Some websites don't respond well to HTTP requests that send 'python-requests' as the User-Agent string.
You can get a 200 OK response if you set the User-Agent header to a browser-like value such as 'Mozilla/5'.

url='http://www.b2i.us/profiles/investor/ResLibraryView.asp?ResLibraryID=81517&GoTopage=3&Category=1836&BzID=1690&G=666'
headers={'User-Agent':'Mozilla/5'}
response = requests.get(url, headers=headers)
print(response)

<Response [200]>
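
Note that requests.get() does not raise an exception for a 410 on its own, so the try/except in the question only catches connection-level failures unless raise_for_status() is called. A minimal sketch of combining both ideas, assuming the 'Mozilla/5' value above is enough for the site; the helper names build_session and fetch and the 10-second timeout are made up for illustration:

```python
import requests

# User-Agent value taken from the answer above; adjust if the site needs more.
BROWSER_USER_AGENT = "Mozilla/5"


def build_session(user_agent=BROWSER_USER_AGENT):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch(session, url, timeout=10):
    """Return the response, or None on a connection error or a 4xx/5xx status."""
    try:
        response = session.get(url, timeout=timeout)
        # Turns statuses such as 410 into requests.exceptions.HTTPError,
        # which is a subclass of RequestException.
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        print(e)
        return None
    return response
```

Calling fetch(build_session(), url) with the url from the question would then either return the response or print the error and return None.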

Answer 2

This works on Mac OS X, but I am having issues with the same approach on Windows, in a VMware virtual machine that I run automated tasks from. Why would the behavior be different? Is there a separate workaround for Windows machines?
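
The thread does not establish why the two machines behave differently. One way to start narrowing it down might be to run the same small report on both and compare the output. This is only a diagnostic sketch: the function names are hypothetical, and proxy settings are just one guess at what could differ.

```python
import platform
import urllib.request

import requests


def describe_environment():
    """Collect local settings that could differ between the two machines."""
    return {
        "platform": platform.platform(),
        "requests_version": requests.__version__,
        # The User-Agent that requests sends when none is set explicitly.
        "default_user_agent": requests.utils.default_user_agent(),
        # Proxies picked up from the environment or the system configuration.
        "system_proxies": urllib.request.getproxies(),
    }


def sent_headers(response):
    """Return the headers that were actually sent for a given response."""
    return dict(response.request.headers)
```

Printing describe_environment() on each machine, and sent_headers(response) after the request, shows whether the 'Mozilla/5' header is really being sent from the Windows VM.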
