Python: How do I get the content type of a URL?

Question

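I need to get the content type of an Internet (or intranet) resource, not a local file. How can I get the MIME type of the resource behind a URL?

Here is what I tried: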

res = urllib.urlopen("http://www.iana.org/assignments/language-subtag-registry")
http_message = res.info()
message = http_message.getplist()

I get: ['charset=UTF-8']

How can I get the Content-Type? Can this be done with urllib, and if not, what other way is there?

Answer 1

import urllib  # Python 2 only; urllib.urlopen was removed in Python 3

res = urllib.urlopen("http://www.iana.org/assignments/language-subtag-registry")
http_message = res.info()
full = http_message.type      # 'text/plain'
main = http_message.maintype  # 'text'

Answer 2

Python 3 solution:

import urllib.request
with urllib.request.urlopen('http://www.google.com') as response:
    info = response.info()
    print(info.get_content_type())      # -> text/html
    print(info.get_content_maintype())  # -> text
    print(info.get_content_subtype())   # -> html

Answer 3

Update: since the info() method is deprecated as of Python 3.9, use the preferred headers attribute instead; the Python documentation describes it.

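import urllib.request  # a bare "import urllib" does not reliably load the request submodule

url = "http://www.iana.org/assignments/language-subtag-registry"  # URL taken from the question
with urllib.request.urlopen(url) as r:
    header = r.headers                            # http.client.HTTPMessage, a subclass of email.message.Message
    contentType = header.get_content_type()       # type only; header.get('content-type') also returns parameters such as the charset
    contentLength = header.get('content-length')  # a string, or None if the header is absent
    filename = header.get_filename()              # from Content-Disposition, or None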
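The getplist() call in the question returned only the parameters of the Content-Type header, which is why the type itself seemed to be missing. As a minimal offline sketch of how the header splits into a type and a charset, the snippet below builds an email.message.Message by hand instead of opening a URL. The helper name split_content_type is hypothetical and not part of any answer above.

```python
from email.message import Message


def split_content_type(header_value):
    """Split a raw Content-Type header value into (mime_type, charset)."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() drops the parameters and lowercases the type;
    # get_content_charset() returns the lowercased charset, or None if absent.
    return msg.get_content_type(), msg.get_content_charset()


print(split_content_type("text/plain; charset=UTF-8"))  # ('text/plain', 'utf-8')
print(split_content_type("text/html"))                  # ('text/html', None)
```

The same two methods should be available on the headers object returned by urllib.request.urlopen, since it is a subclass of Message.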

Alternatively, here is a quick way to get the MIME type without actually loading the URL:

import mimetypes
contentType, encoding = mimetypes.guess_type(url)

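This second method is not guaranteed to return an answer. It is a quick-and-dirty trick: it only inspects the URL string (in practice, the file extension) and never opens the URL.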
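To make the limitation concrete, here is a small sketch comparing a URL that ends in a known extension with the extensionless URL from the question. The wrapper name guess_from_url and the example.com address are assumptions for illustration only.

```python
import mimetypes


def guess_from_url(url):
    """Guess a MIME type from the URL string alone; no network access."""
    content_type, _encoding = mimetypes.guess_type(url)
    return content_type


# A recognizable extension yields a guess.
print(guess_from_url("http://example.com/report.pdf"))  # application/pdf

# The URL from the question has no extension, so there is nothing to guess from.
print(guess_from_url("http://www.iana.org/assignments/language-subtag-registry"))  # None
```

Because the guess never consults the server, it can also be wrong when a server sends a type that does not match the extension, so the response headers remain the authoritative source.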

Answer 1: 17 votes, accepted, 2012-09-18 10:03:29

Answer 2: 12 votes, 2016-04-27 07:07:42

Answer 3: 2 votes, 2022-02-23 14:57:09
