
Get image from image url: IOError: cannot identify image file

I am using Python requests to get an image file from an image url.

The code below works in most cases, but it is failing for a growing number of URLs.

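import cStringIO  # Python 2 in-memory buffer, used below

import requests
from PIL import Image  # provides Image.open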
image_url = "<url_here>"
headers = {'User-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36', 'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8','Accept-Encoding':'gzip,deflate,sdch'}
r = requests.get(image_url, headers=headers)
image = Image.open(cStringIO.StringIO(r.content))

If that raises an error, I retry with a different set of headers (this has solved the problem in the past):

headers = {'User-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36', 'Accept':'image/webp,*/*;q=0.8','Accept-Encoding':'gzip,deflate,sdch'}

However, the following URLs (among others) don't work with either set of headers. They raise "IOError: cannot identify image file".

http://www.paleoeffect.com/wp-content/uploads/2011/06/800x414xpaleo_bread_wheat_recipe-800x414.jpg.pagespeed.ic.6pprrYPoTo.webp

http://cdn.casaveneracion.com/vegetarian/2013/08/vegan-spaghetti1.jpg

http://www.rachaelray.com/site/images/sidebar-heading-more-recipes-2.svg

The images display fine in my browser at these URLs. I don't know whether all three fail for the same reason.

You are using the Python Imaging Library (PIL) to provide the Image class mentioned in the last line of your code.

  • The Paleo Effect image is a WebP file. PIL does not support the WebP format.
  • The Casa Veneracion URL does not link to an image file: it returns a 302 redirect to an HTML page. (You can confirm this by requesting the URL and inspecting the response.)
  • The Rachael Ray image is an SVG file. PIL does not support the SVG format.

See the bottom of the PIL documentation for the image formats that PIL supports.
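
One way to tell these three cases apart before calling Image.open is to look at the first bytes of the response body. The following is only a minimal sketch: sniff_payload is a hypothetical helper name (it is not part of PIL or requests), it expects a bytes object, and the list of signatures it checks is deliberately short.

```python
def sniff_payload(body):
    """Guess what kind of data `body` (bytes) holds from its leading bytes.

    Hypothetical helper for illustration; it recognises only a few
    common signatures and returns "unknown" for everything else.
    """
    head = body[:512]

    # Binary raster formats start with fixed magic numbers.
    if head.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP is a RIFF container whose form type (bytes 8..11) is "WEBP".
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"

    # Text-based payloads: an HTML page (e.g. after a redirect) or an SVG.
    text = head.lstrip().lower()
    if text.startswith(b"<!doctype html") or text.startswith(b"<html"):
        return "html"
    if b"<svg" in text:
        return "svg"

    return "unknown"
```

With the code from the question, one could log r.history (non-empty when requests followed a redirect), r.headers.get('Content-Type'), and sniff_payload(r.content) for each failing URL. A result of "webp", "svg", or "html" would point to one of the three causes listed above rather than to a problem with the request headers.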
