
Asynchronous File Upload to Amazon S3 with Django

I am using this file storage engine to store files to Amazon S3 when they are uploaded:

http://code.welldev.org/django-storages/wiki/Home

It takes quite a long time to upload because the file must first be uploaded from the client to the web server, and then from the web server to Amazon S3, before a response is returned to the client.

I would like to make the process of sending the file to S3 asynchronous, so the response can be returned to the user much faster. What is the best way to do this with the file storage engine?

Thanks for your advice!

I've taken another approach to this problem.

My models have two file fields: one uses the standard file storage backend, and the other uses the S3 file storage backend. When the user uploads a file, it gets stored locally.

I have a management command in my application that uploads all the locally stored files to S3 and updates the models.

So when a request comes in for the file, I check whether the model object uses the S3 storage field; if so, I send a redirect to the correct URL on S3, and if not, I send a redirect so that nginx can serve the file from disk.

This management command can of course be triggered by any event, such as a cron job.
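A minimal sketch of what that command's transfer loop might look like. The row tuples, bucket name, and helper names here are illustrative, not from the original project; a real management command would iterate over a queryset and save the S3-backed field after each successful upload:

```python
import os

def needs_transfer(local_name, remote_name):
    # A row still needs uploading when the local file field is set
    # but the S3-backed field is empty.
    return bool(local_name) and not remote_name

def transfer_pending(rows, local_root, bucket):
    """Upload every pending local file to S3 and report the rows moved.

    `rows` stands in for the model queryset: (pk, local_name, remote_name).
    """
    import boto3  # imported lazily so needs_transfer stays testable offline
    s3 = boto3.client("s3")
    moved = []
    for pk, local_name, remote_name in rows:
        if not needs_transfer(local_name, remote_name):
            continue
        s3.upload_file(os.path.join(local_root, local_name), bucket, local_name)
        moved.append((pk, local_name))
        # A real command would now save local_name into the S3-backed
        # field on the model instance with pk=pk.
    return moved
```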

It's possible to have your users upload files directly to S3 from their browser using a special form (with an encrypted policy document in a hidden field). They will be redirected back to your application once the upload completes.

More information here: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1434
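On a modern stack, the signed-policy form described above can be built with boto3's generate_presigned_post. The bucket, key, and redirect URL below are placeholders, and the function takes the S3 client as an argument purely so the sketch stays easy to test:

```python
def browser_upload_form(s3_client, bucket, key, redirect_url, expires=3600):
    """Return the POST URL and hidden form fields (including the signed
    policy document) for a direct browser-to-S3 upload form.

    `s3_client` is expected to behave like a boto3 S3 client.
    """
    redirect = {"success_action_redirect": redirect_url}
    return s3_client.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Fields=dict(redirect),   # rendered as hidden <input> fields
        Conditions=[redirect],   # the policy must allow each posted field
        ExpiresIn=expires,
    )
```

The returned dict contains a `url` to POST to and a `fields` mapping to render as hidden inputs alongside the `<input type="file">`.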

There is an app for that :-)

https://github.com/jezdez/django-queued-storage

It does exactly what you need, and much more, because you can set any "local" storage and any "remote" storage. This app will store your file in fast "local" storage (for example, MogileFS storage) and then, using Celery (django-celery), will attempt an asynchronous upload to the "remote" storage.

A few remarks:

  1. The tricky thing is that you can set it up with a copy-and-upload strategy, or with an upload-and-delete strategy that deletes the local file once it has been uploaded.

  2. The second tricky thing is that it will serve the file from "local" storage until it has been uploaded.

  3. It can also be configured to retry a number of times on upload failures.

Installation and usage are also very simple and straightforward:

pip install django-queued-storage

Append to INSTALLED_APPS:

INSTALLED_APPS += ('queued_storage',)

In models.py:

from django.db import models
from queued_storage.backends import QueuedStorage

queued_s3storage = QueuedStorage(
    'django.core.files.storage.FileSystemStorage',
    'storages.backends.s3boto.S3BotoStorage',
    task='queued_storage.tasks.TransferAndDelete')

class MyModel(models.Model):
    my_file = models.FileField(upload_to='files', storage=queued_s3storage)

You could decouple the process:

  • The user selects a file to upload and sends it to your server. After this he sees a page: "Thank you for uploading foofile.txt, it is now stored in our storage backend."
  • When the user has uploaded the file, it is stored in a temporary directory on your server and, if needed, some metadata is stored in your database.
  • A background process on your server then uploads the file to S3. This is only possible if you have full access to your server, so you can create some kind of "daemon" to do this (or simply use a cron job).*
  • The page that is displayed polls asynchronously and shows some kind of progress bar to the user (or a simple "please wait" message). This would only be needed if the user should be able to "use" the file (put it in a message, or something like that) directly after uploading.

[*: In case you only have shared hosting, you could possibly build a solution which uses a hidden iframe in the user's browser to start a script which then uploads the file to S3.]
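The background-process step above can be sketched as a small spool loop that a cron job or daemon runs periodically. The directory layout and the upload callable (for example a thin wrapper around boto3's upload_file) are assumptions, not part of the original answer:

```python
import os

def pending(spool_dir):
    """List spooled files oldest-first, so uploads happen in arrival order."""
    names = [n for n in os.listdir(spool_dir) if not n.startswith(".")]
    return sorted(names,
                  key=lambda n: os.path.getmtime(os.path.join(spool_dir, n)))

def run_once(upload, spool_dir):
    """One pass of the daemon: upload each spooled file, then delete it.

    `upload(path, name)` is any callable that pushes the file to S3.
    """
    for name in pending(spool_dir):
        path = os.path.join(spool_dir, name)
        upload(path, name)
        os.remove(path)  # only reached if the upload did not raise
```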

You can upload media directly to the S3 server without going through your web application server.

See the following references:

Amazon API Reference: http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingHTTPPOST.html

A Django implementation: https://github.com/sbc/django-uploadify-s3

I encountered the same issue with uploaded images. You cannot pass files along to a Celery worker, because Celery needs to be able to pickle the arguments to a task. My solution was to deconstruct the image data into a string, get all the other info from the file, and pass this data and info to the task, where I reconstructed the image. After that you can save it, which will send it to your storage backend (such as S3). If you want to associate the image with a model, just pass the id of the instance along to the task and retrieve it there, bind the image to the instance, and save the instance.

When a file has been uploaded via a form, it is available in your view as an UploadedFile, a file-like object. You can get it directly out of request.FILES, or better, first bind it to your form, run is_valid, and retrieve the file-like object from form.cleaned_data. At that point at least you know it is the kind of file you want it to be. After that you can get the data using read(), and get the other info using other methods/attributes. See https://docs.djangoproject.com/en/1.4/topics/http/file-uploads/
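A sketch of the serialize/rebuild halves of that approach. The function names are made up for illustration; in a real project, store_upload would be a Celery @shared_task that wraps the bytes in django.core.files.base.ContentFile and saves them onto the model instance, which pushes them to S3:

```python
import base64

def serialize_upload(uploaded_file):
    """Flatten an UploadedFile into pickle/JSON-safe primitives that
    Celery can pass to a worker."""
    return {
        "name": uploaded_file.name,
        "content_type": uploaded_file.content_type,
        "data": base64.b64encode(uploaded_file.read()).decode("ascii"),
    }

# In the real project this would be decorated with @shared_task.
def store_upload(payload, instance_pk=None):
    """Rebuild the raw bytes on the worker side; a real task would wrap
    them in a ContentFile, bind them to the model instance looked up via
    instance_pk, and save the instance."""
    raw = base64.b64decode(payload["data"])
    return payload["name"], raw
```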

I actually ended up writing and distributing a little package to save an image asynchronously. Have a look at https://github.com/gterzian/django_async. Right now it's just for images, but you could fork it and add functionality for your situation. I'm using it with https://github.com/duointeractive/django-athumb and S3.

As some of the answers here suggest uploading directly to S3, here is a Django S3 mixin using plupload: https://github.com/burgalon/plupload-s3mixin
