Can we reload a page/url in python using urllib or urllib2 or requests or mechanize?

I am trying to open a page/link and capture its content. Sometimes it gives me the required content and sometimes it throws an error. If I refresh the page a few times, I get the content.

So, I want to reload the page and catch it.

Here's my pseudo code:

attempts = 0
while attempts:
    try:
        open_page = urllib2.Request(www.xyz.com)
        # Or I think we can also do urllib2.urlopen(www.xyz.com)
        break
    except: 
        # here I want to refresh/reload the page
        attempts += 1


My questions are:
1. How can I reload the page using urllib or urllib2 or requests or mechanize?
2. Can we wrap a try/except in a loop that way?

Thank you!

If you write while attempts when attempts equals 0, the loop body never runs. I'd do it the other way around: initialize attempts to the number of reloads you want and count down:

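import urllib2

attempts = 10
while attempts:
    try:
        # urlopen() actually sends the request; Request() only builds it.
        open_page = urllib2.urlopen('http://www.xyz.com')
    except urllib2.URLError:
        # Request failed: use up one attempt and try again.
        attempts -= 1
    else:
        # Success: stop retrying.
        break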
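
On Python 3, where urllib2 was merged into urllib.request, the same countdown idea might look like the sketch below. The names fetch_with_retries and opener are hypothetical, introduced only for illustration; the opener parameter is there so the loop can be exercised without a network connection.

```python
import urllib.error
import urllib.request


def fetch_with_retries(url, attempts=10, opener=urllib.request.urlopen):
    """Return the body of url, re-requesting it up to `attempts` times.

    Hypothetical helper: the name and the opener parameter are assumptions.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            # Every call to opener() sends a fresh request, i.e. a "reload".
            with opener(url) as response:
                return response.read()
        except urllib.error.URLError as exc:  # HTTPError is a subclass
            last_error = exc
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error
```

There is no separate "reload" operation in these libraries; requesting the same URL again is the reload.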
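
Alternatively, requests can do the retrying itself if you mount an HTTPAdapter configured with a urllib3 Retry policy:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

attempts = 10

# Retry connection failures and the listed 5xx responses up to attempts times,
# backing off between tries.
retries = Retry(total=attempts,
                backoff_factor=0.1,
                status_forcelist=[500, 502, 503, 504])

sess = requests.Session()
sess.mount('http://', HTTPAdapter(max_retries=retries))
sess.mount('https://', HTTPAdapter(max_retries=retries))
sess.get('http://www.google.co.nz/')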
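
The session setup above could be wrapped in a pair of small functions so it is built once and reused. This is only a sketch: make_retrying_session and fetch are hypothetical names, and the 10-second timeout is an assumed value, not something from the answer.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(attempts=10, backoff_factor=0.1):
    """Build a Session whose requests are retried (hypothetical helper)."""
    retries = Retry(
        total=attempts,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],
    )
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retries)
    # The same adapter handles both schemes.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch(session, url, timeout=10):
    """Return the page text, or None if the request ultimately failed."""
    try:
        # timeout is an assumed value; without one a request can hang.
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
        return response.text
    except requests.exceptions.RequestException:
        return None
```

With this approach the retry loop lives inside the adapter, so the calling code needs no while loop of its own.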

The following function retries when an exception is raised or when the HTTP response status code indicates an error (response.ok is False for 4xx and 5xx codes):

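import time
import traceback

import requests

def retrieve(url):
    # Loops until a request succeeds; there is no cap on the number of tries.
    while True:
        try:
            response = requests.get(url)
            if response.ok:
                return response
            # Error status: report it, wait, and request the page again.
            print(response.status_code)
            time.sleep(3)
        except requests.RequestException:
            # Network-level failure: report it, wait, and try again.
            print(traceback.format_exc())
            time.sleep(3)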
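
Because the loop above never gives up, a page that is permanently down would keep it running forever. One possible variation, sketched here with the hypothetical name retry_call, caps the number of tries and doubles the wait after each failure. The sleep parameter is an assumption added so the waiting can be replaced in tests.

```python
import time


def retry_call(func, attempts=5, base_delay=3.0, sleep=time.sleep):
    """Call func() until it succeeds, at most `attempts` times.

    Hypothetical helper: func is any zero-argument callable that raises
    an exception on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            # With the default base_delay this waits 3s, then 6s, then 12s...
            sleep(base_delay * 2 ** attempt)
```

To count a bad status code as a failure as well, func can call requests.get(url) and then response.raise_for_status() before returning the response.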

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact yoyou2525@163.com.

 