
Python urllib2 timeout

OK guys, I've searched Google and Stack Overflow for an answer to this, and after a few hours I still have not found a working script that does it.

Below I paste four examples of Python scripts that are supposed to trigger a timeout when opening a non-existent URL, using the socket default timeout and/or the timeout parameter.

None of them works: the timeout is never triggered.

Any ideas?

First example:

import urllib2

try:                
    header_s = {"User-Agent":"Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"}

    req = urllib2.Request("http://www.nonexistantdomainurl.com/notexist.php",headers = header_s)


    print urllib2.urlopen(req, None, 5.0).read()

except urllib2.URLError, e:
    print "Url Error: %r" % e

except Exception, e:
    print "Failure of type ", e

else: 
    print "all ok!"

Second example:

import urllib2

try:
    response = urllib2.urlopen("http://www.nonexistantdomainurl.com/notexist.php", None, 2.5)
except urllib2.URLError, e:
    print "Oops, timed out?"

Third example:

from urllib2 import Request, urlopen, URLError, HTTPError
import base64


req = Request('http://www.nonexistantdomainurl.com/notexist.php')

try:
    response = urlopen(req,timeout=5.0)   

except HTTPError, e:
    print 'The server couldn\'t fulfill the request.'
    print 'Error code: ', e.code
except URLError, e:
    print 'We failed to reach a server.'
    print 'Reason: ', e.reason

Fourth example:

import urllib2
import socket


socket.setdefaulttimeout(5)

try:
    response = urllib2.urlopen("http://www.faluquito.com/equipo.php",timeout=5.0).read()   


except urllib2.URLError, e:
    print "Url Error: %r" % e
>>> import urllib2
>>> import time
>>> import contextlib
>>>
>>> def timeit():
...   s = time.time()
...   try:
...     yield
...   except urllib2.URLError:
...     pass
...   print 'took %.3f secs' % (time.time() - s)
...
>>> timeit = contextlib.contextmanager(timeit)
>>> with timeit():
...   r = urllib2.urlopen('http://loc:8080', None, 2)
...
took 2.002 secs
>>> with timeit():
...   r = urllib2.urlopen('http://loc:8080', None, 5)
...
took 5.003 secs
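
The same measurement can be sketched for Python 3, where urllib2 became urllib.request. This is only an illustration under stated assumptions: timed_fetch and silent_server are hypothetical helper names, and a local socket that never replies stands in for the unreachable host so that the run does not depend on DNS or on the network. It also catches socket.timeout next to URLError, because a timeout that fires while waiting for the response may not arrive wrapped in URLError.

```python
import socket
import time
import urllib.error
import urllib.request


def timed_fetch(url, timeout):
    """Return (outcome, elapsed_seconds) for one attempt to open url."""
    # An empty ProxyHandler keeps environment proxy settings out of the measurement.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.time()
    try:
        opener.open(url, timeout=timeout).close()
        outcome = "ok"
    except (urllib.error.URLError, socket.timeout) as exc:
        # Connect-phase failures arrive wrapped in URLError; a timeout while
        # waiting for the response can surface as a bare socket.timeout.
        outcome = type(exc).__name__
    return outcome, time.time() - start


def silent_server():
    """Listen on a free local port but never answer (hypothetical test helper)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(5)
    return srv


if __name__ == "__main__":
    srv = silent_server()
    url = "http://127.0.0.1:%d/" % srv.getsockname()[1]
    for seconds in (0.5, 1.0):
        outcome, elapsed = timed_fetch(url, seconds)
        print("timeout=%.1f -> %s after %.3f secs" % (seconds, outcome, elapsed))
    srv.close()
```

If the elapsed time tracks the timeout value, as in the session above, the timeout itself is working, and a request that fails at once is failing for another reason, such as name resolution.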

If your machine has the Unix program dig, you may be able to identify non-existent URLs like this:

import logging
import subprocess
import shlex

logging.basicConfig(level = logging.DEBUG,
                    format = '%(asctime)s %(module)s %(levelname)s: %(message)s',
                    datefmt = '%M:%S')
logger = logging.getLogger(__name__)

urls = ['http://1.2.3.4',
       "http://www.nonexistantdomainurl.com/notexist.php",
       "http://www.faluquito.com/equipo.php",
        'google.com']

nonexistent = ['63.251.179.13', '8.15.7.117']
for url in urls:
    logger.info('Trying {u}'.format(u=url))

    proc = subprocess.Popen(shlex.split(
        'dig +short +time=1 +retry=0 {u}'.format(u = url)),
                            stdout = subprocess.PIPE, stderr = subprocess.PIPE)
    out, err = proc.communicate()
    out = out.splitlines()
    logger.info(out)
    if any(addr in nonexistent for addr in out):
        logger.info('nonexistent\n')
    else:
        logger.info('success\n')

On my machine, this yields:

00:57 test INFO: Trying http://1.2.3.4
00:58 test INFO: ['63.251.179.13', '8.15.7.117']
00:58 test INFO: nonexistent

00:58 test INFO: Trying http://www.nonexistantdomainurl.com/notexist.php
00:58 test INFO: ['63.251.179.13', '8.15.7.117']
00:58 test INFO: nonexistent

00:58 test INFO: Trying http://www.faluquito.com/equipo.php
00:58 test INFO: ['63.251.179.13', '8.15.7.117']
00:58 test INFO: nonexistent

00:58 test INFO: Trying google.com
00:58 test INFO: ['72.14.204.113', '72.14.204.100', '72.14.204.138', '72.14.204.102', '72.14.204.101']
00:58 test INFO: success

Notice that dig returns ['63.251.179.13', '8.15.7.117'] for non-existent URLs.

I believe my ISP is redirecting lookups of non-existent addresses to either 63.251.179.13 or 8.15.7.117. Your ISP may do something different, in which case you may have to change the nonexistent list to match.
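
If dig is not available, a similar check might be sketched with the standard socket module, here in Python 3. The helper names hostname_of, resolve and is_nonexistent are hypothetical, and the placeholder addresses are the two reported above, which are specific to one ISP. Unlike the loop above, this sketch strips the scheme and path before the lookup and treats a host as non-existent only when every returned address is a placeholder.

```python
import socket
from urllib.parse import urlparse

# Placeholder addresses observed above; an assumption that holds for one ISP only.
SENTINEL_ADDRESSES = ['63.251.179.13', '8.15.7.117']


def hostname_of(url):
    """Extract the host part; bare names such as 'google.com' are accepted too."""
    if "//" not in url:
        url = "//" + url  # lets urlparse treat the text as a network location
    return urlparse(url).hostname


def resolve(host):
    """Return the set of IPv4 addresses for host, or an empty set on failure."""
    try:
        return set(socket.gethostbyname_ex(host)[2])
    except OSError:
        return set()


def is_nonexistent(addresses, sentinel_addresses=SENTINEL_ADDRESSES):
    """True when resolution failed or returned only placeholder addresses."""
    return not addresses or addresses <= set(sentinel_addresses)


if __name__ == "__main__":
    for url in ["http://www.nonexistantdomainurl.com/notexist.php", "google.com"]:
        host = hostname_of(url)
        verdict = "nonexistent" if is_nonexistent(resolve(host)) else "success"
        print("%s: %s" % (host, verdict))
```

Running the check before urlopen lets a script skip hosts that only resolve because of the ISP's redirection.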
