Python server cgi.FieldStorage parsing multipart/form-data
I have been writing a simple web server in Python, and right now I'm trying to handle multipart/form-data POST requests. I can already handle application/x-www-form-urlencoded POST requests, but the same code doesn't work for multipart requests. If it looks like I am misunderstanding anything, please call me out, even if it's something minor. Also, if you have any advice on making my code better, please let me know as well :) Thanks!
When a request comes in, I first parse it and split it into a dictionary of headers and a string for the body of the request. I then use those to construct a FieldStorage form, which I can treat like a dictionary to pull the data out:
import cgi
import urlparse
from StringIO import StringIO

# Read one byte at a time until the blank line that ends the request head.
requestInfo = ''
while requestInfo[-4:] != '\r\n\r\n':
    requestInfo += conn.recv(1)

# The request line looks like "POST /path HTTP/1.1".
requestSplit = requestInfo.split('\r\n')[0].split(' ')
requestType = requestSplit[0]
url = urlparse.urlparse(requestSplit[1])
path = url[2]  # Grab Path

if requestType == "POST":
    headers, body = parse_post(conn, requestInfo)
    print "!!!Request!!! " + requestInfo
    print "!!!Body!!! " + body
    form = cgi.FieldStorage(headers=headers, fp=StringIO(body), environ={'REQUEST_METHOD': 'POST'}, keep_blank_values=1)
Here's my parse_post method:
def parse_post(conn, headers_string):
    headers = {}
    headers_list = headers_string.split('\r\n')
    # Skip the request line (index 0) and the two empty strings left by the trailing '\r\n\r\n'.
    for i in range(1, len(headers_list) - 2):
        header = headers_list[i].split(': ', 1)
        headers[header[0]] = header[1]
    content_length = int(headers['Content-Length'])
    content = conn.recv(content_length)
    # Parse Content differently if it's a multipart request??
    return headers, content
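A side note on the `conn.recv(content_length)` call above: a single `recv` is allowed to return fewer bytes than requested, so a larger body can arrive truncated. Below is a minimal sketch of reading until the full length has arrived. It assumes Python 3 and a blocking socket; `recv_exact` and the 4096-byte chunk size are hypothetical choices, not part of the original code.

```python
def recv_exact(conn, length, chunk_size=4096):
    """Read exactly `length` bytes from a blocking socket-like object.

    `conn` only needs a recv(n) method that returns bytes.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = conn.recv(min(remaining, chunk_size))
        if not chunk:
            # An empty result means the peer closed the connection early.
            raise ConnectionError("connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

In `parse_post`, the body read would then become something like `content = recv_exact(conn, content_length)`.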
So for an x-www-form-urlencoded POST request, I can treat the FieldStorage form like a dictionary. If I call, for example:
firstname = form['firstname'].value
print firstname
it works. However, if I send a multipart POST request instead, it ends up printing nothing.
This is the body of the x-www-form-urlencoded request:
firstname=TEST&lastname=rwar
This is the body of the multipart request:
--070f6a3146974d399d97c85dcf93ed44
Content-Disposition: form-data; name="lastname"; filename="lastname"

rwar
--070f6a3146974d399d97c85dcf93ed44
Content-Disposition: form-data; name="firstname"; filename="firstname"

TEST
--070f6a3146974d399d97c85dcf93ed44--
So here's the question: should I manually parse the body for the data in parse_post if it's a multipart request?
Or is there a method that I need to, or can, use to parse the multipart body?
Or am I doing this completely wrong?
Thanks again. I know it's a long read, but I wanted to make sure my question was comprehensive.
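One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: to my recollection, when Python 2's `cgi.FieldStorage` is given a plain dict for `headers`, it looks for the lowercase key `content-type` and falls back to treating a POST body as urlencoded when that key is absent. A dict keyed `Content-Type`, as built by `parse_post`, would then work for urlencoded bodies and silently fail for multipart ones. The sketch below builds the header dict with lowercase names; `parse_headers`, the `/submit` path, and the `Content-Length` value are hypothetical, made up for illustration.

```python
def parse_headers(headers_string):
    """Turn a raw request head into a dict keyed by lowercase header name."""
    headers = {}
    # Skip the request line, then stop at the blank line that ends the head.
    for line in headers_string.split('\r\n')[1:]:
        if not line:
            break
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return headers


# Hypothetical request head, using the boundary from the question.
raw_head = (
    'POST /submit HTTP/1.1\r\n'
    'Content-Type: multipart/form-data; boundary=070f6a3146974d399d97c85dcf93ed44\r\n'
    'Content-Length: 28\r\n'
    '\r\n'
)
headers = parse_headers(raw_head)
print(headers['content-type'])
```

HTTP header names are case-insensitive, so normalizing them is reasonable whether or not it turns out to be the cause here.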
So I solved my problem, but in a totally hacky way.
I ended up manually parsing the body of the request. Here's the code I wrote:
# Rebuild the multipart body as an urlencoded string ("key=value&key=value").
if "multipart/form-data" in headers["Content-Type"]:
    data_list = []
    # Each "\r\n\r\n" separates one part's headers from its value.
    content_list = content.split("\r\n\r\n")
    for i in range(len(content_list) - 1):
        data_list.append("")
    data_list[0] += content_list[0].split("name=")[1].split(";")[0].replace('"', '') + "="
    for i, c in enumerate(content_list[1:-1]):
        # Each middle chunk holds the previous field's value and the next field's headers.
        key = c.split("name=")[1].split(";")[0].replace('"', '')
        data_list[i + 1] += key + "="
        value = c.split("\r\n")
        data_list[i] += value[0]
    data_list[-1] += content_list[-1].split("\r\n")[0]
    content = "&".join(data_list)
If anybody can still solve my problem without having to manually parse the body, please let me know!
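For comparison, here is a minimal sketch of splitting the same body without hand-written string slicing, using the standard library's `email` package. It assumes Python 3 (the code in the question is Python 2), and `parse_multipart` is a hypothetical helper name. It is a sketch for small text fields, not vetted for large or binary uploads.

```python
from email import message_from_bytes


def parse_multipart(content_type, body):
    """Return {field name: raw bytes} for a multipart/form-data body.

    `content_type` is the full Content-Type header value, boundary included.
    """
    # The email parser needs a header block to learn the boundary.
    head = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = message_from_bytes(head + body)
    if not message.is_multipart():
        raise ValueError("body does not match the declared multipart boundary")
    fields = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        fields[name] = part.get_payload(decode=True)
    return fields


# The multipart body from the question, with explicit CRLF line endings.
boundary = "070f6a3146974d399d97c85dcf93ed44"
body = (
    b"--" + boundary.encode("ascii") + b"\r\n"
    b'Content-Disposition: form-data; name="lastname"; filename="lastname"\r\n'
    b"\r\n"
    b"rwar\r\n"
    b"--" + boundary.encode("ascii") + b"\r\n"
    b'Content-Disposition: form-data; name="firstname"; filename="firstname"\r\n'
    b"\r\n"
    b"TEST\r\n"
    b"--" + boundary.encode("ascii") + b"--\r\n"
)
fields = parse_multipart("multipart/form-data; boundary=" + boundary, body)
print(fields)
```

The values come back as bytes, so a caller would decode them (for example with `.decode("utf-8")`) before treating them as text.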
There's the streaming-form-data project, which provides a Python parser for data that is multipart/form-data encoded. It's intended to allow parsing data in chunks, but since no chunk size is enforced, you could just pass your entire input at once and it should do the job. It should be installable via pip install streaming_form_data.
Here's the source code - https://github.com/siddhantgoel/streaming-form-data
Documentation - https://streaming-form-data.readthedocs.io/en/latest/
Disclaimer: I'm the author. Of course, please create an issue in case you run into a bug. :)