
How to decode HTTP request headers and body in Python 3?

I am writing an experimental asynchronous web server. What is the standard or 'best' way to decode HTTP requests in Python?

Reading from the socket gives me the raw incoming request as bytes. How can I turn these into standard datatypes such as dictionaries and lists of values? Is there a good general tutorial on how to do this and what to watch out for (especially regarding encodings and browser specifics)?

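This worked for me (Python 2):

import StringIO, httplib

# Decode the raw bytes, then wrap the text in a file-like object,
# because HTTPMessage reads its headers from a file object.
ucode_data = unicode(your_raw_data, "utf-8")
stream = StringIO.StringIO(ucode_data)
http_header = httplib.HTTPMessage(stream, 0)
http_header.readheaders()

# Dump the parsed header fields.
print http_header.__dict__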

However, it does not parse the request line (e.g., GET /index.html HTTP/1.1); it will decode the rest of the headers for you.
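Since the request line has to be handled separately, here is a minimal Python 3 sketch of one way to split it off. The function name `parse_request_line` is hypothetical, and the sketch assumes the request uses CRLF line endings and exactly three space-separated fields; a real server would need more lenient and more defensive handling.

```python
def parse_request_line(raw: bytes):
    """Split a raw request into (method, target, version, remaining bytes)."""
    # Assumption: the request line ends at the first CRLF.
    line, _, rest = raw.partition(b"\r\n")
    # ISO-8859-1 maps every byte to a character, so decoding never raises.
    parts = line.decode("iso-8859-1").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, target, version = parts
    return method, target, version, rest


method, target, version, rest = parse_request_line(
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
)
print(method, target, version)
```

The remaining bytes (`rest`) hold the header block and body, which can then be handed to a header parser.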

See

20.10.4. HTTPMessage Objects

An http.client.HTTPMessage instance holds the headers from an HTTP response. It is implemented using the email.message.Message class.

http://docs.python.org/py3k/library/http.client.html#httpmessage-objects

You should be able to use HTTPMessage as a standalone class without invoking urllib (or its Python 3 equivalent).
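As a rough illustration of using HTTPMessage on its own in Python 3, the sketch below feeds the header block to `email.parser.Parser` with `_class=http.client.HTTPMessage`. The helper names `parse_headers_and_body` and `decode_form_body` are hypothetical, and the sketch assumes the whole request is already in memory, that headers end at the first blank CRLF line, and that a missing charset means UTF-8.

```python
import http.client
from email.parser import Parser
from urllib.parse import parse_qs


def parse_headers_and_body(raw: bytes):
    """Return (HTTPMessage, body bytes) for a complete raw request."""
    # Assumption: a blank CRLF line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    # Skip the request line; only the header fields go to the parser.
    _, _, header_block = head.partition(b"\r\n")
    headers = Parser(_class=http.client.HTTPMessage).parsestr(
        header_block.decode("iso-8859-1"), headersonly=True
    )
    return headers, body


def decode_form_body(headers, body: bytes):
    """Decode the body using the charset named in Content-Type, if any."""
    charset = headers.get_content_charset("utf-8")  # assumed default
    if headers.get_content_type() == "application/x-www-form-urlencoded":
        return parse_qs(body.decode(charset))  # dict of value lists
    return body.decode(charset)


raw = (
    b"POST /submit HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded; charset=utf-8\r\n"
    b"\r\n"
    b"a=1&a=2&b=3"
)
headers, body = parse_headers_and_body(raw)
print(headers["Host"], decode_form_body(headers, body))
```

Header lookup on the resulting object is case-insensitive, and `get_all()` returns every value of a repeated header as a list.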

Don't deal with sockets; abstract! Try httplib2. It's a complete HTTP library for Python 2 and 3, and it is very intuitive, although you have to download and install it. Read its usage example for a quick introduction.

Dive Into Python 3 includes a very good chapter on installing and using httplib2, and why it's better than other alternatives, including the standard library; I recommend you read that.

