
Is it safe to download a NamedTemporaryFile from a Pyramid FileResponse?

I'm currently working on an export feature for a web application built with Pyramid on Python, running on Ubuntu 14.04. It zips the files into a NamedTemporaryFile and sends it back through a FileResponse:

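import os
import zipfile
from tempfile import NamedTemporaryFile

from pyramid.response import FileResponse

# Create the temporary file to store the zip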
with NamedTemporaryFile(delete=True) as output:
    map_zip = zipfile.ZipFile(output, 'w', zipfile.ZIP_DEFLATED)
    length_mapdir = len(map_directory)

    for root, dirs, files in os.walk(map_directory, followlinks=True):
        for file in files:
            file_path = os.path.join(root, file)
            map_zip.write(file_path, file_path[length_mapdir:])

    map_zip.close()

    # Send the response as an attachment so the user can download the file
    response = FileResponse(os.path.abspath(output.name))
    response.headers['Content-Type'] = 'application/download'
    response.headers['Content-Disposition'] = 'attachment; filename="' + filename + '"'
    return response

On the client side, the export takes some time, then the file download popup appears. Nothing goes wrong, and everything is in the zip as planned.

While the files are being zipped, I can see a file growing in /tmp/, and before the download popup appears, that file disappears. I assume this is the NamedTemporaryFile.

While the file is being zipped or downloaded, there is no significant change in the amount of RAM being used: it stays around 40 MB while the actual zip is over 800 MB.

Where is Pyramid downloading the file from? From what I understand of tempfile, the file is unlinked when it is closed. If that's true, is it possible that another process could write to the memory where the file was stored, corrupting whatever Pyramid is downloading?

In Unix environments, reference counting is used when a file is created and opened. Each open() call on a file increases its reference count, and each close() decreases it. unlink() is special: when it is called, the file's name is removed from the directory tree, but the file itself remains on disk as long as the reference count stays above 0.
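The behaviour described above can be checked with a short script. This is a minimal sketch, assuming Python 3 on a POSIX system such as Linux; the helper name read_after_unlink is made up for illustration and is not part of Pyramid or tempfile.

```python
import os
import tempfile


def read_after_unlink(payload):
    """Unlink a file while a handle is still open, then read through that handle."""
    fd, path = tempfile.mkstemp()
    os.close(fd)  # mkstemp returns an open descriptor; we reopen the path below

    with open(path, "wb") as writer:
        writer.write(payload)

    reader = open(path, "rb")  # one open reference to the file
    try:
        os.unlink(path)  # remove the name from the directory tree
        still_listed = os.path.exists(path)
        # Number of directory entries still pointing at the file
        # (expected to be 0 on a typical Linux file system).
        link_count = os.fstat(reader.fileno()).st_nlink
        data = reader.read()  # the data is still reachable through the handle
    finally:
        reader.close()  # last reference dropped, storage can be reclaimed
    return still_listed, link_count, data


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```

On such a system the name should be gone immediately after unlink(), while the read through the open handle should still return the full payload.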

In your case, NamedTemporaryFile() creates a file on disk named /tmp/somefile:

  1. /tmp/somefile now has a link count of 1.
  2. /tmp/somefile then has open() called on it so that the file can be returned to you. This increases the reference count to 1.
  3. /tmp/somefile is then written to by your code, in this case with a zip file.
  4. /tmp/somefile is then passed to FileResponse(), which calls open() on it, increasing the reference count to 2.
  5. You exit the scope of the with statement, and NamedTemporaryFile() calls close() followed by unlink(). Your file now has 1 reference to it and a link count of 0. Because that reference still exists, the file still exists on disk, but it can no longer be found by name.
  6. FileResponse() is iterated over by your WSGI server. Once the file has been fully read, your WSGI server calls close() on it, dropping the reference count to 0, at which point the file system cleans the file up entirely.
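The steps above can be imitated without Pyramid. This is a rough sketch, assuming Python 3 on a POSIX system (on Windows, reopening a NamedTemporaryFile by name while it is open generally fails); a plain open() stands in for what FileResponse() is described as doing in step 4, and simulate_response_lifecycle is a hypothetical name.

```python
import os
from tempfile import NamedTemporaryFile


def simulate_response_lifecycle(payload):
    """Mimic steps 1 to 6 with a second handle standing in for FileResponse."""
    with NamedTemporaryFile(delete=True) as output:
        output.write(payload)
        output.flush()  # make sure the bytes are in the file before reopening it
        path = output.name
        # Stand-in for the response object opening the file eagerly (step 4).
        response_file = open(path, "rb")
    # Step 5: the with block has closed and unlinked the temporary file.
    name_gone = not os.path.exists(path)
    try:
        # Step 6: the "server" reads the body through the remaining handle.
        body = response_file.read()
    finally:
        response_file.close()
    return name_gone, body


if __name__ == "__main__":
    gone, body = simulate_response_lifecycle(b"zip bytes")
    print(gone, len(body))
```

The point of the sketch is only the ordering: the second handle is opened before the with block ends, so the data stays readable after the name has disappeared.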

It is only at that last point that the file is no longer accessible. In the meantime, your file is completely safe, and there is no way for it to be overwritten, in memory or otherwise.

That being said, if FileResponse() were lazily loaded, for example (i.e. it wouldn't open() the file until the WSGI server started sending the response), it would be entirely possible for it to attempt to open() the temporary file too late, after NamedTemporaryFile() had already deleted it. Just something to keep in mind.
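To make that caveat concrete, here is a small sketch of what a late open() would run into, assuming Python 3 (for FileNotFoundError). The function name open_after_with_block is hypothetical, and the late open() only imitates a lazily loading response; it is not how FileResponse() is described above.

```python
from tempfile import NamedTemporaryFile


def open_after_with_block(payload):
    """Try to open the temporary file by name after the with block has ended."""
    with NamedTemporaryFile(delete=True) as output:
        output.write(payload)
        output.flush()
        path = output.name
    # The file was closed and unlinked when the with block exited, and nothing
    # else held it open, so the name no longer resolves to anything.
    try:
        with open(path, "rb") as late_handle:
            late_handle.read()
    except FileNotFoundError:
        return "missing"
    return "opened"


if __name__ == "__main__":
    print(open_after_with_block(b"zip bytes"))
```

If a response object ever did open its file lazily, creating the temporary file with delete=False and removing it once the response has been sent would be one way to avoid this, at the cost of handling the cleanup yourself.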
