My application makes numerous HTTP requests. Without writing a regular expression, how do I parse Content-Type
header values? For example:
text/html; charset=UTF-8
For context, here is my code for fetching a resource from the internet:
from requests import head
foo = head("http://www.example.com")
The output I am expecting is similar to what the methods in mimetools do. For example:
x = magic("text/html; charset=UTF-8")
This would give:
x.getparam('charset') # UTF-8
x.getmaintype() # text
x.getsubtype() # html
requests doesn't give you an interface to parse the content type, unfortunately, and the standard library on this stuff is a bit of a mess. So I see two options:
Option 1: Use the third-party python-mimeparse library.
Option 2: To separate the MIME type from options like charset, you can use the same technique that requests uses to parse type/encoding internally: use cgi.parse_header.
import cgi
import requests
response = requests.head('http://example.com')
mimetype, options = cgi.parse_header(response.headers['Content-Type'])
The rest should be simple enough to handle with a split:
maintype, subtype = mimetype.split('/')
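Note that the cgi module was removed in Python 3.13, so the snippet above only works on older interpreters. As a rough alternative, the standard library's email.message.Message class can parse the same header and offers methods close to the ones asked for in the question. This is only a sketch: parse_content_type is a hypothetical helper name, not part of requests or the standard library.

```python
from email.message import Message


def parse_content_type(value):
    """Wrap a raw Content-Type value in a Message so its parsing methods can be used.

    parse_content_type is a hypothetical helper name chosen for this example.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg


x = parse_content_type("text/html; charset=UTF-8")
print(x.get_param("charset"))     # UTF-8
print(x.get_content_maintype())   # text
print(x.get_content_subtype())    # html
```

With requests, you would pass response.headers['Content-Type'] (or foo.headers['Content-Type'] from the question) as the argument.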
Your question is a bit unclear. I assume that you are using some sort of web application framework such as Django or Flask?
Here is an example of how to read the Content-Type header using Flask:
from flask import Flask, request

app = Flask(__name__)

@app.route("/")
def test():
    # Return the incoming request's Content-Type header (empty string if absent)
    return request.headers.get('Content-Type', '')

if __name__ == "__main__":
    app.run()
Your response (foo) will have a dictionary with the headers. Try something like:
foo.headers.get('content-type')
Or print foo.headers to see all the headers.