Python server: parsing multipart/form-data with cgi.FieldStorage

Question

I have been writing a simple web server in Python, and right now I'm trying to handle multipart/form-data POST requests. I can already handle application/x-www-form-urlencoded POST requests, but the same code doesn't work for multipart. If it looks like I am misunderstanding anything, please call me out, even if it's something minor. Also, if you have any advice on making my code better, please let me know as well :) Thanks!

When a request comes in, I first parse it and split it into a dictionary of headers and a string for the body of the request. I then use those to construct a FieldStorage form, which I can treat like a dictionary to pull the data out:

# Python 2
import cgi
import urlparse
from StringIO import StringIO

# Read one byte at a time until the blank line that ends the headers.
requestInfo = ''
while requestInfo[-4:] != '\r\n\r\n':
    requestInfo += conn.recv(1)

# The first line is the request line: METHOD URL VERSION
requestSplit = requestInfo.split('\r\n')[0].split(' ')
requestType = requestSplit[0]

url = urlparse.urlparse(requestSplit[1])
path = url[2] # Grab Path

if requestType == "POST":
    headers, body = parse_post(conn, requestInfo)

    print "!!!Request!!! " + requestInfo
    print "!!!Body!!! " + body
    form = cgi.FieldStorage(headers = headers, fp = StringIO(body), environ = {'REQUEST_METHOD':'POST'}, keep_blank_values=1)

Here's my parse_post method:

def parse_post(conn, headers_string):
    headers = {}
    headers_list = headers_string.split('\r\n')

    for i in range(1,len(headers_list)-2):
        header = headers_list[i].split(': ', 1)
        headers[header[0]] = header[1]

    content_length = int(headers['Content-Length'])

    content = conn.recv(content_length)

    # Parse Content differently if it's a multipart request??

    return headers, content
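A side note on the `conn.recv(content_length)` call above: `recv(n)` returns at most `n` bytes, so a larger body can arrive in several pieces and a single call may return only part of it. Below is a minimal sketch of a read loop, written for Python 3 rather than the Python 2 used in the question. The helper name `recv_exactly` is made up for illustration, and the demo uses a local socket pair instead of a real client.

```python
import socket


def recv_exactly(conn, length):
    """Read exactly `length` bytes from a connected socket, or raise."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = conn.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection early.
            raise ConnectionError(
                "socket closed with %d bytes still expected" % remaining
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Local demonstration: the body is sent in two pieces but read as one.
client, server = socket.socketpair()
client.sendall(b"firstname=TEST")
client.sendall(b"&lastname=rwar")
print(recv_exactly(server, 28))
client.close()
server.close()
```

In `parse_post`, a loop like this would take the place of the single `recv` call; whether that matters in practice depends on how large the request bodies are.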

So for an x-www-form-urlencoded POST request, I can treat the FieldStorage form like a dictionary, and if I call, for example:

firstname = args['firstname'].value
print firstname

It will work. However, if I instead send a multipart POST request, it ends up printing nothing.

This is the body of the x-www-form-urlencoded request:

firstname=TEST&lastname=rwar

This is the body of the multipart request:

--070f6a3146974d399d97c85dcf93ed44
Content-Disposition: form-data; name="lastname"; filename="lastname"

rwar
--070f6a3146974d399d97c85dcf93ed44
Content-Disposition: form-data; name="firstname"; filename="firstname"

TEST
--070f6a3146974d399d97c85dcf93ed44--

So here's the question: should I manually parse the body for the data in parse_post if it's a multipart request?

Or is there a method that I need/can use to parse the multipart body?

Or am I doing this completely wrong?

Thanks again. I know it's a long read, but I wanted to make sure my question was comprehensive.

Answer 1

So I solved my problem, but in a totally hacky way.

I ended up manually parsing the body of the request. Here's the code I wrote:

if("multipart/form-data" in headers["Content-Type"]):
    data_list = []
    content_list = content.split("\r\n\r\n")
    for i in range(len(content_list) - 1):
        data_list.append("")

    data_list[0] += content_list[0].split("name=")[1].split(";")[0].replace('"','') + "="

    for i,c in enumerate(content_list[1:-1]):
        key = c.split("name=")[1].split(";")[0].replace('"','')
        data_list[i+1] += key + "="
        value = c.split("\r\n")
        data_list[i] += value[0]

    data_list[-1] += content_list[-1].split("\r\n")[0]

    content = "&".join(data_list)
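One thing that may be worth checking before resorting to manual parsing: as far as can be told from the cgi.py source, FieldStorage looks up the lower-case keys 'content-type' and 'content-length' in whatever headers object it is given, and falls back to application/x-www-form-urlencoded for a POST when 'content-type' is missing. A plain dict keyed by 'Content-Type' would then never be seen as multipart. Treat that as an assumption to verify against your Python version. The sketch below (Python 3, with a hypothetical helper name `parse_headers` and a made-up request head) only shows the header parsing with lower-cased names.

```python
def parse_headers(raw_request):
    """Return the request headers as a dict with lower-cased names."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    headers = {}
    # Skip the request line; every remaining line is "Name: value".
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


# Sample request head; the path, host and length are made up.
raw = (
    "POST /submit HTTP/1.1\r\n"
    "Host: localhost\r\n"
    "Content-Type: multipart/form-data; "
    "boundary=070f6a3146974d399d97c85dcf93ed44\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
headers = parse_headers(raw)
# Assumption: this lower-case key is the one FieldStorage looks for.
print(headers["content-type"])
```

HTTP header names are case-insensitive anyway, so normalizing them on the way in also makes lookups such as `headers['content-length']` robust against clients that send different capitalization.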

If anybody can still solve my problem without having to manually parse the body, please let me know!
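One stdlib-only way to avoid the manual string splitting is to hand the body to the `email` package, since a multipart/form-data body uses the same MIME multipart structure. This is a minimal sketch for Python 3, not a drop-in replacement for the code above: the function name `parse_multipart` is made up, values come back as bytes, the whole body is held in memory, and a repeated field name overwrites the earlier value.

```python
from email.parser import BytesParser
from email.policy import default


def parse_multipart(content_type, body):
    """Map each form field name to its raw bytes value."""
    # The email parser needs a header block to learn the boundary,
    # so put the request's Content-Type value in front of the body.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = BytesParser(policy=default).parsebytes(prefix + body)
    fields = {}
    for part in message.iter_parts():
        # The field name is the "name" parameter of Content-Disposition.
        name = part.get_param("name", header="content-disposition")
        if name is not None:
            fields[name] = part.get_payload(decode=True)
    return fields


# The multipart body from the question, rebuilt with CRLF line endings.
BOUNDARY = b"070f6a3146974d399d97c85dcf93ed44"
body = (
    b"--" + BOUNDARY + b"\r\n"
    b'Content-Disposition: form-data; name="lastname"; filename="lastname"\r\n'
    b"\r\n"
    b"rwar\r\n"
    b"--" + BOUNDARY + b"\r\n"
    b'Content-Disposition: form-data; name="firstname"; filename="firstname"\r\n'
    b"\r\n"
    b"TEST\r\n"
    b"--" + BOUNDARY + b"--\r\n"
)
content_type = "multipart/form-data; boundary=" + BOUNDARY.decode("ascii")
print(parse_multipart(content_type, body))
```

The `content_type` argument is the value of the request's Content-Type header, boundary parameter included; without the boundary the parser has no way to find the parts.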

Answer 2

There's the streaming-form-data project, which provides a Python parser for multipart/form-data encoded data. It's intended to allow parsing data in chunks, but since no chunk size is enforced, you can just pass your entire input at once and it should do the job. It should be installable via pip install streaming_form_data.

Here's the source code - https://github.com/siddhantgoel/streaming-form-data

Documentation - https://streaming-form-data.readthedocs.io/en/latest/

Disclaimer: I'm the author. Of course, please create an issue in case you run into a bug. :)

Answer 1: 2, accepted, 2014-02-24 22:32:15

Answer 2: 0, 2017-06-20 19:46:00
