urllib doesn't give me the correct filetype

I'm using the Python urllib module to get images from an external URL. It works well, but some images give me problems, like this one: https://cdn.tutsplus.com/wp/uploads/2014/01/grunt-logo-400.png

My code is the following:

import urllib
img = urllib.urlretrieve("https://cdn.tutsplus.com/wp/uploads/2014/01/grunt-logo-400.png")

When I print img[0] it shows me: "/tmp/tmpbuhfUW.png"

But if I print img[1].type it gives me: "text/html"

So the filetype is incorrect.

Is there any solution?

PS: I checked my /tmp folder, where the image is downloaded, and I noticed the image is blank.

PS2: I've also tried urllib2.urlopen("cdn.tutsplus.com/wp/uploads/2014/01/grunt-logo-400.png"), but it gives me error 403.

UPDATE: Finally I solved it by doing the following:

class MyOpener(urllib.FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'

myopener = MyOpener()
i = myopener.retrieve("https://cdn.tutsplus.com/wp/uploads/2014/01/grunt-logo-400.png")

Now it prints the filetype as "image/png".
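
The update above works by sending a browser-like User-Agent. In Python 3, urllib.FancyURLopener is gone, so here is a minimal sketch of the same idea with urllib.request. The names BROWSER_UA, build_request and fetch_image, and the User-Agent string itself, are assumptions for illustration and not part of the original post; whether a given server accepts this particular User-Agent is also an assumption.

```python
import urllib.request

# Hypothetical browser-like User-Agent string (an assumption, any
# value the server accepts would do).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_image(url, dest_path):
    """Download url to dest_path and return the reported content type."""
    req = build_request(url)
    with urllib.request.urlopen(req) as resp:
        # Content type as reported by the server, e.g. "image/png"
        content_type = resp.headers.get_content_type()
        with open(dest_path, "wb") as out:
            out.write(resp.read())
    return content_type


# Example call (performs a network request, so it is left commented out):
# fetch_image("https://cdn.tutsplus.com/wp/uploads/2014/01/grunt-logo-400.png",
#             "/tmp/grunt-logo-400.png")
```

If the server still answers with an error status, urlopen raises urllib.error.HTTPError instead of silently saving an HTML error page, which makes this kind of failure easier to notice.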

As far as I can tell, you aren't doing anything wrong. urllib is just guessing the MIME type incorrectly. I don't know exactly what you're trying to do, but you could say

filetype = img[0].rsplit('.', 1)[-1]  # text after the last dot, e.g. "png"

to retrieve the file extension, and then check whether it is contained in a list of known image extensions to determine whether the URL was a link to an image.
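
The extension check described above could be sketched as follows. IMAGE_EXTENSIONS, file_extension and has_image_extension are hypothetical names, and the set of extensions is only an example. Note that this only inspects the file name; in the question the temporary file was named .png even though the server reported "text/html", so a matching extension does not prove the content is an image.

```python
import os

# Example set of image extensions (an assumption, extend as needed).
IMAGE_EXTENSIONS = {"png", "jpg", "jpeg", "gif", "bmp", "webp"}


def file_extension(path):
    """Return the lowercase extension of path without the leading dot."""
    # splitext only looks at the last path component, so dots in
    # directory names are ignored.
    return os.path.splitext(path)[1].lstrip(".").lower()


def has_image_extension(path):
    """Return True if the file name ends in a known image extension."""
    return file_extension(path) in IMAGE_EXTENSIONS
```

For example, has_image_extension(img[0]) could be called on the path returned by urlretrieve.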

