
How to get Python to successfully download large images from the internet

So I've been using

urllib.request.urlretrieve(URL, FILENAME)

to download images off the internet. It works great, but fails on some images. The ones it fails on seem to be the larger images, e.g. http://i.imgur.com/DEKdmba.jpg . The download completes without an error, but when I try to open these files, Windows Photo Viewer gives me the error "Windows Photo Viewer can't open this picture because the file appears to be damaged, corrupted, or too large".

What might be the reason it can't download these, and how can I fix this?

EDIT: After looking further, I don't think the problem is large images; it manages to download larger ones. It just seems to be a few random images that fail every time I run the script. Now I'm even more confused.

In the past, I have used this code for copying from the internet. I have had no trouble with large files.

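# Python 2 (urllib2, raw_input, print statement)
import urllib2

def download(url):
    file_name = raw_input("Name: ")
    u = urllib2.urlopen(url)
    f = open(file_name, 'wb')  # binary mode, so the bytes are written unchanged
    meta = u.info()
    file_size = int(meta.getheaders("Content-Length")[0])
    print "Downloading: %s Bytes: %s" % (file_name, file_size)
    file_size_dl = 0
    block_size = 8192
    # Read the response in fixed-size blocks and write each one to disk.
    while True:
        buffer = u.read(block_size)
        if not buffer:
            break
        file_size_dl += len(buffer)
        f.write(buffer)
    f.close()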
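
The Content-Length value read above can also help work out why a saved file will not open. Here is a minimal Python 3 sketch, assuming the file is supposed to be a JPEG. The function name check_jpeg_download and the marker checks are my own illustration, not part of any library. A JPEG normally starts with the bytes FF D8 and ends with FF D9, so a missing start marker suggests the server sent something else (for example an HTML error page), and a missing end marker suggests the download was cut short. Treat it as a heuristic only, since some valid files carry extra bytes after the end marker.

```python
JPEG_START = b"\xff\xd8"  # start-of-image marker
JPEG_END = b"\xff\xd9"    # end-of-image marker


def check_jpeg_download(path, expected_size=None):
    """Return a list of problems found in a downloaded JPEG (hypothetical helper)."""
    problems = []
    with open(path, "rb") as f:
        data = f.read()

    # Compare against the Content-Length header value, if the caller has it.
    if expected_size is not None and len(data) != expected_size:
        problems.append(
            "size mismatch: got %d bytes, expected %d" % (len(data), expected_size)
        )

    if not data.startswith(JPEG_START):
        problems.append("no JPEG start marker; the file may be an HTML page or another format")
    elif not data.endswith(JPEG_END):
        problems.append("no JPEG end marker; the download may have been cut short")

    return problems
```

An empty list means none of these checks failed. It does not prove the image is valid.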

Here's the sample code for Python 3 (tested on Windows 7):

import urllib.request

def download_very_big_image():
    url = 'http://i.imgur.com/DEKdmba.jpg'
    filename = 'C:/big_image.jpg'
    conn = urllib.request.urlopen(url)
    # The binary flag ('b') is needed on Windows so the bytes are written unchanged.
    with open(filename, 'wb') as output:
        output.write(conn.read())
    conn.close()
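
The code above reads the whole image into memory with conn.read() before writing it. For very large files, a variation that copies the response to disk in chunks may be preferable. This is a sketch rather than a tested fix for the original problem. The function name download_in_chunks and the 64 KiB chunk size are my own choices.

```python
import shutil
import urllib.request


def download_in_chunks(url, filename, chunk_size=64 * 1024):
    """Copy the response body to a file without holding it all in memory."""
    # Both objects are context managers, so they are closed even on error.
    with urllib.request.urlopen(url) as response, open(filename, "wb") as output:
        # copyfileobj reads chunk_size bytes at a time until the stream ends.
        shutil.copyfileobj(response, output, chunk_size)
```

Usage would look like download_in_chunks('http://i.imgur.com/DEKdmba.jpg', 'big_image.jpg').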

For completeness' sake, here's the equivalent code in Python 2:

import urllib2

def download_very_big_image():
    url = 'http://i.imgur.com/DEKdmba.jpg'
    filename = 'C:/big_image.jpg'
    conn = urllib2.urlopen(url)
    # The binary flag ('b') is needed on Windows so the bytes are written unchanged.
    with open(filename, 'wb') as output:
        output.write(conn.read())
    conn.close()

This should work, using the requests module:

import requests

img_url = 'http://i.imgur.com/DEKdmba.jpg'
img_name = img_url.split('/')[-1]
img_data = requests.get(img_url).content
with open(img_name, 'wb') as handler:
    handler.write(img_data)
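
One thing the snippet above does not do is check the HTTP status, so an error page returned by the server would be saved under the image's name and then fail to open. A hedged variation, assuming a reasonably recent requests version (the function name download_image and the 30 second timeout are my own choices):

```python
import requests


def download_image(url, filename, timeout=30):
    """Stream an image to disk, raising on HTTP errors (illustrative sketch)."""
    # stream=True defers downloading the body so it can be written in chunks.
    with requests.get(url, stream=True, timeout=timeout) as response:
        # Raise requests.HTTPError on 4xx/5xx instead of saving an error page.
        response.raise_for_status()
        with open(filename, "wb") as handler:
            for chunk in response.iter_content(chunk_size=8192):
                handler.write(chunk)
    return filename
```

Called as download_image(img_url, img_name), it writes the same file as the snippet above when the request succeeds, and raises an exception when the server reports an error.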
