PIL: image from url, cannot identify image file

Question

I am trying to load an image from a URL:

http://www.lifeasastrawberry.com/wp-content/uploads/2013/04/IMG_1191-1024x682.jpg

However, the last step fails with IOError("cannot identify image file"). I'm not sure what is going on or how to fix it; the same code has worked with many other image URLs.

    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    opener.addheaders = [('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8')]
    opener.addheaders = [('Accept-Encoding', 'gzip,deflate,sdch')]

    response = opener.open(image_url,None,5)
    img_file = cStringIO.StringIO(response.read())  

    image = Image.open(img_file)

This URL also fails:

http://www.canadianliving.com/img/photos/biz/Greek-Yogurt-Ceaser-Salad-Dressi1365783448.jpg

Answer 1

The problem is that you're telling your URL retriever to ask the server for a gzip-encoded result, so the image data you receive is gzip-compressed and PIL cannot identify it. You can solve this either by leaving the Accept-Encoding header off your request or by decompressing the gzip-encoded result manually:

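    from PIL import Image
    import urllib2
    import gzip
    import cStringIO

    url = 'http://www.lifeasastrawberry.com/wp-content/uploads/2013/04/IMG_1191-1024x682.jpg'

    opener = urllib2.build_opener()
    # Set all headers in one list; assigning addheaders repeatedly keeps only the last one.
    opener.addheaders = [
        ('User-agent', 'Mozilla/5.0'),
        ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
        ('Accept-Encoding', 'gzip,deflate,sdch'),
    ]

    # The body arrives gzip-compressed, so wrap it in GzipFile before handing it to PIL.
    gzipped_file = cStringIO.StringIO(opener.open(url, None, 5).read())
    image = Image.open(gzip.GzipFile(fileobj=gzipped_file))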

The drawback of this approach is that if your request accepts multiple encodings, you need to check the Content-Encoding header of the response to see which encoding you actually got, and then decode the body accordingly.
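As a rough illustration of that header check, here is a minimal sketch. It is written for Python 3 (an assumption; the question uses Python 2's urllib2), and `decode_body` is a hypothetical helper name, not part of any library. It handles only gzip and deflate; sdch and other encodings are rejected.

```python
import gzip
import zlib


def decode_body(raw, content_encoding):
    """Return the decoded bytes of a response body.

    raw is the body as received; content_encoding is the value of the
    Content-Encoding response header, or None if the header is absent.
    (Hypothetical helper, shown for illustration only.)
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "identity":
        return raw
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(raw)
    if encoding == "deflate":
        try:
            # "deflate" in HTTP normally means zlib-wrapped data.
            return zlib.decompress(raw)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib wrapper.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    raise ValueError("unsupported Content-Encoding: %r" % encoding)
```

With a Python 3 `urllib.request` response, the header value should be available as `response.headers.get("Content-Encoding")`; the decoded bytes can then be wrapped in `io.BytesIO` and passed to `Image.open`.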

I think it's easier to set the Accept-Encoding header to a value that accepts only one encoding (e.g., 'identity;q=1, *;q=0' or something like that), or to go ahead and start using the requests package for HTTP.
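A minimal sketch of the identity-only option, again assuming Python 3's `urllib.request` rather than urllib2. The names `build_identity_opener` and `fetch_bytes` are made up for this example.

```python
import urllib.request


def build_identity_opener():
    """Build an opener that asks the server for an uncompressed body."""
    opener = urllib.request.build_opener()
    # One list with every header; reassigning addheaders would drop earlier ones.
    opener.addheaders = [
        ("User-agent", "Mozilla/5.0"),
        ("Accept-Encoding", "identity;q=1, *;q=0"),
    ]
    return opener


def fetch_bytes(url, timeout=5):
    """Download url and return the raw response body as bytes."""
    opener = build_identity_opener()
    with opener.open(url, None, timeout) as response:
        return response.read()


# Example use (requires network access and Pillow):
#   import io
#   from PIL import Image
#   image = Image.open(io.BytesIO(fetch_bytes(image_url)))
```

If you switch to requests instead, `response.content` is already decoded for gzip and deflate, so the bytes can go straight into `io.BytesIO` without any manual decompression.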
