Page redirects in browser but not in Python

Question

I'm trying to test sites to see if they redirect from HTTP to HTTPS. Here is my code:

import requests

url = "http://www.google.com"
page = requests.get(url)

if page.history:
    # page.history holds the intermediate redirect responses, oldest first
    print("Request was redirected")
    for resp in page.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(page.status_code, page.url)
else:
    print(page.headers)
    print(page.history)
    print(page.url)
    print(page.status_code)
    print("Request was not redirected")

When I test http://www.google.com with various online header checkers, I get a 302 redirect to the HTTPS site. When I run the code above, though, I get a 200 status code and a page result. With a site like http://fb.com, the same code gives the following result:

Request was redirected
301 http://fb.com/
302 http://www.facebook.com/?_rdr
Final destination:
200 https://www.facebook.com/

Is this somehow just a Google thing, or am I missing something?

Answer 1

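Google varies its responses a lot based on the User-Agent string, and requests sends its own default (python-requests/<version>) unless you override it. Try fetching with a browser-like one: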

page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'})

or with some other user agent string and see if that changes the behaviour.

Also, be aware that if you hit Google with a script, it doesn't take long before you get blocked, or at least start seeing CAPTCHAs, even with a real user agent string.
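
To make the comparison easier, here is a minimal sketch that fetches the same URL with the default requests User-Agent and with a browser-like one, then prints each redirect chain. The helper names (upgraded_to_https, redirect_chain, compare_user_agents) and the 10-second timeout are my own choices for illustration, not part of requests. What Google actually returns for each User-Agent may vary, so treat this as a way to observe the behaviour rather than a statement of what it will be.

```python
from urllib.parse import urlsplit

# Browser-like User-Agent taken from the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
)


def upgraded_to_https(urls):
    """Return True if the chain starts on http:// and ends on https://.

    `urls` is the ordered list of URLs visited, with the final URL last.
    """
    if not urls:
        return False
    first = urlsplit(urls[0]).scheme.lower()
    last = urlsplit(urls[-1]).scheme.lower()
    return first == "http" and last == "https"


def redirect_chain(url, user_agent=None, timeout=10):
    """Fetch `url` and return [(status_code, url), ...], final response last."""
    import requests  # imported lazily so upgraded_to_https() works without it

    # An empty dict leaves the default python-requests User-Agent in place.
    headers = {"User-Agent": user_agent} if user_agent else {}
    page = requests.get(url, headers=headers, timeout=timeout)
    hops = [(resp.status_code, resp.url) for resp in page.history]
    hops.append((page.status_code, page.url))
    return hops


def compare_user_agents(url):
    """Print the redirect chain for the default and the browser-like User-Agent."""
    for user_agent in (None, BROWSER_UA):
        hops = redirect_chain(url, user_agent=user_agent)
        print(user_agent or "default requests User-Agent")
        for status, hop_url in hops:
            print("  ", status, hop_url)
        print("   upgraded to HTTPS:", upgraded_to_https([u for _, u in hops]))
```

Calling compare_user_agents("http://www.google.com") prints both chains side by side. Note that the first entry in the chain is the first response's URL, which requests may normalise (for example by adding a trailing slash).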
