I am trying to open a page/link and capture its content. Sometimes it gives me the required content, and sometimes it throws an error. I have noticed that if I refresh the page a few times, I get the content. So I want to reload the page and try again in a loop.
Here's my pseudocode:
attempts = 0
while attempts:
    try:
        open_page = urllib2.Request('www.xyz.com')
        # Or I think we can also do urllib2.urlopen('www.xyz.com')
        break
    except:
        # here I want to refresh/reload the page
        attempts += 1
My questions are:
1. How can I reload the page using urllib, urllib2, requests, or mechanize?
2. Can we loop a try/except block that way?
Thank you!
If you do while attempts with attempts equal to 0, you will never enter the loop. I'd do it backwards: initialize attempts to your desired number of reloads:
import urllib2

attempts = 10
while attempts:
    try:
        # Note: urllib2.Request only builds a request object;
        # urlopen is what actually fetches the page and can fail.
        open_page = urllib2.urlopen('http://www.xyz.com')
    except urllib2.URLError:
        attempts -= 1
    else:
        attempts = False
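A slightly more idiomatic variant is a for loop over the attempt count, so you don't have to manage the counter yourself. A minimal sketch, assuming urllib2 and using http://www.xyz.com as a stand-in for your URL (the 2-second pause is an arbitrary choice):

import time
import urllib2

content = None
for attempt in range(10):
    try:
        content = urllib2.urlopen('http://www.xyz.com').read()
        break  # success: stop retrying
    except urllib2.URLError:
        time.sleep(2)  # wait a moment before reloading

if content is None:
    print('all attempts failed')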
With requests you can also let the underlying urllib3 transport do the retrying for you:

import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

attempts = 10

retries = Retry(total=attempts,
                backoff_factor=0.1,
                status_forcelist=[500, 502, 503, 504])

sess = requests.Session()
sess.mount('http://', HTTPAdapter(max_retries=retries))
sess.mount('https://', HTTPAdapter(max_retries=retries))

sess.get('http://www.google.co.nz/')
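Once the retries are exhausted, sess.get still raises an exception, so you may want to wrap the call. A sketch reusing the session above, with a placeholder URL:

try:
    response = sess.get('http://www.xyz.com')  # placeholder URL
    print(response.text)
except requests.exceptions.RetryError:
    # every retry got a status code listed in status_forcelist
    print('server kept returning an error status')
except requests.exceptions.ConnectionError:
    # every retry failed at the connection level
    print('could not connect')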
The following function reloads the page whenever an exception is raised or the HTTP response indicates an error:
import time
import traceback

import requests

def retrieve(url):
    while True:
        try:
            response = requests.get(url)
            if response.ok:  # True for status codes below 400
                return response
            print(response.status_code)
            time.sleep(3)
        except requests.exceptions.RequestException:
            print(traceback.format_exc())
            time.sleep(3)
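As written, retrieve loops forever until it succeeds. If you would rather give up after a fixed number of attempts, a minimal variation could look like this (retrieve_with_limit and max_attempts are names introduced here for illustration, not part of the original answer):

def retrieve_with_limit(url, max_attempts=10):
    # assumed helper: same idea as retrieve, but capped at max_attempts
    for _ in range(max_attempts):
        try:
            response = requests.get(url)
            if response.ok:
                return response
        except requests.exceptions.RequestException:
            pass
        time.sleep(3)
    return None  # every attempt failed

This reuses the requests and time imports from the block above.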