Newlines removed in POST request body? (Google App Engine)
I am building a REST API on Google App Engine (not using Endpoints) that will allow users to upload a CSV or tab-delimited file and search for potential duplicates. Since it's an API, I cannot use <form>s or the BlobStore's upload_url. I also cannot rely on having a single web client that will call this API. Instead, ideally, users will send the file in the body of the request.
My problem is that when I try to read the content of a tab-delimited file, I find that all newline characters have been removed, so there is no way of splitting the content into rows.
If I check the content of the file directly in the Python interpreter, I see that the tabs and newlines are there (output is truncated in the example):
>>> with open('./data/occ_sample.txt') as o:
...     o.read()
...
'id\ttype\tmodified\tlanguage\trights\n123456\tPhysicalObject\t2015-11-11 11:50:59.0\ten\thttp://creativecommons.org/licenses/by-nc/3.0\n...'
The RequestHandler logs the content of the request body:
import logging

import webapp2


class ReportApi(webapp2.RequestHandler):
    def post(self):
        # Log the raw request body exactly as it was received
        logging.info(self.request.body)
        ...
So when I call the API running in the dev_appserver via curl:
curl -X POST -d @data/occ_sample.txt http://localhost:8080/api/v0/report
This shows up in the logs:
id type modified language rights123456 PhysicalObject 2015-11-11 11:50:59.0 en http://creativecommons.org/licenses/by-nc/3.0
As you can see, there is nothing between the last header value and the first value of the first record (rights and 123456 respectively), and the same happens between the last value of each record and the first value of the next.
Am I missing something obvious here? I have tried loading the data with self.request.body, self.request.body_file and self.request.POST, and none seem to work. I also tried setting the Content-Type values text/csv, text/plain and application/csv in the request headers, with no success. Should I use a different Content-Type?
You are using the wrong curl command-line option to send your file data, and it is this option that is stripping the newlines.
The -d option parses out your data and sends an application/x-www-form-urlencoded request, and it strips newlines. From the curl manpage:
-d, --data <data>
[...]
If you start the data with the letter @, the rest should be a file name to read the data from, or - if you want curl to read the data from stdin. Multiple files can also be specified. Posting data from a file named 'foobar' would thus be done with --data @foobar. When --data is told to read from a file like that, carriage returns and newlines will be stripped out.
Bold emphasis mine.
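The effect described in the manpage can be reproduced without curl. The following is a minimal sketch, not curl's actual implementation: the helper name strip_like_curl_d is hypothetical, and the sample string is just the first two rows of the file shown in the question.

```python
def strip_like_curl_d(content):
    """Mimic the documented effect of `curl -d @file`: drop CR and LF.

    Hypothetical helper for illustration only; this is not curl's code.
    """
    return content.replace("\r", "").replace("\n", "")


# First two rows of the sample file from the question.
sample = (
    "id\ttype\tmodified\tlanguage\trights\n"
    "123456\tPhysicalObject\t2015-11-11 11:50:59.0\ten\t"
    "http://creativecommons.org/licenses/by-nc/3.0\n"
)

stripped = strip_like_curl_d(sample)

# Tabs survive, but the row boundary is gone: "rights" and "123456" are fused,
# which matches what the handler logged.
print(repr(stripped))
print(len(sample.splitlines()), len(stripped.splitlines()))
```

Once the line breaks are gone there is nothing left for the server to split on, regardless of which request attribute or Content-Type is used.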
Use the --data-binary option instead:
--data-binary <data>
(HTTP) This posts data exactly as specified with no extra processing whatsoever.
If you start the data with the letter @, the rest should be a filename. Data is posted in a similar manner as --data-ascii does, except that newlines and carriage returns are preserved and conversions are never done.
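Since the API is meant to be called by clients other than curl, the same "send the bytes untouched" idea can be sketched in Python. This is only an illustration under assumptions: it uses the Python 3 standard library, the URL is the one from the question, and the text/tab-separated-values Content-Type and the function name build_upload_request are choices made here, not something the question or curl prescribes.

```python
import urllib.request

# URL taken from the question; the Content-Type value below is an assumption.
URL = "http://localhost:8080/api/v0/report"


def build_upload_request(payload, url=URL):
    """Build a POST request whose body is `payload`, byte for byte."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "text/tab-separated-values"},
        method="POST",
    )


# In practice the payload would come from open(path, "rb").read().
payload = b"id\ttype\n123456\tPhysicalObject\n"
request = build_upload_request(payload)

# Nothing is sent until urllib.request.urlopen(request) is called.
print(request.get_method(), request.data.count(b"\n"))
```

Reading the file in binary mode and passing the bytes as the request body avoids any client-side rewriting of line endings, which is the behaviour --data-binary gives you with curl.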
You may want to include a Content-Type header in that case (for example with -H), because curl still defaults to application/x-www-form-urlencoded when posting with --data-binary; of course, whether this matters depends on whether your handler looks at that header.
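Once the newlines arrive intact, the handler can split the body into rows. A minimal sketch of that step, assuming UTF-8 uploads and Python 3; parse_rows is a hypothetical helper, and inside the handler it would be given self.request.body:

```python
import csv
import io


def parse_rows(body, delimiter="\t"):
    """Split a request body (bytes) into a list of dicts, one per record."""
    text = body.decode("utf-8")  # assumption: uploads are UTF-8 encoded
    # newline="" leaves line endings untouched so the csv module handles them.
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return list(reader)


# Same two rows as the sample file in the question, newlines preserved.
body = (
    b"id\ttype\tmodified\tlanguage\trights\n"
    b"123456\tPhysicalObject\t2015-11-11 11:50:59.0\ten\t"
    b"http://creativecommons.org/licenses/by-nc/3.0\n"
)
rows = parse_rows(body)
print(len(rows), rows[0]["rights"])
```

A handler that does care about the Content-Type header could use it to pick the delimiter, for example a comma for text/csv and a tab otherwise.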