
tornado - transferring a file to cdn without blocking

I have the nginx upload module handling site uploads, but still need to transfer files (let's say 3-20mb each) to our cdn, and would rather not delegate that to a background job.

What is the best way to do this with tornado without blocking other requests? Can I do this in an async callback?

You may find it useful in the overall architecture of your site to add a message queuing service such as RabbitMQ.

This would let you complete the upload via the nginx module, then, in the tornado handler, post a message containing the uploaded file path and exit. A separate process would watch for these messages and handle the transfer to your CDN. This type of service would be useful for many other tasks that could be handled offline (sending emails, etc.). As your system grows, it also provides a mechanism to scale by moving queue processing to separate machines.
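As a rough sketch of the handler side, using the Pika client discussed below (the queue name, form field name, and connection details are illustrative, and in practice you would reuse one broker connection rather than reconnecting per request):

import pika
import tornado.web

class QueueUploadHandler(tornado.web.RequestHandler):
    def post(self):
        # The nginx upload module reports the on-disk path of each
        # upload as a form field; "file.path" is a placeholder for
        # whatever field name you configured.
        path = self.get_argument("file.path")

        # Publish the path to a durable queue and return immediately;
        # the slow CDN transfer happens in the consumer process.
        conn = pika.BlockingConnection(pika.ConnectionParameters("127.0.0.1"))
        ch = conn.channel()
        ch.queue_declare(queue="cdn_transfer", durable=True)
        ch.basic_publish(exchange="", routing_key="cdn_transfer", body=path)
        conn.close()

        self.write("queued")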

I am using an architecture very similar to this. Just make sure to add your message consumer process to supervisord or whatever you are using to manage your processes.
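A minimal supervisord entry for the consumer might look like this (program name and paths are placeholders):

[program:cdn-consumer]
command=python /srv/myapp/cdn_consumer.py
autostart=true
autorestart=true
stderr_logfile=/var/log/cdn-consumer.err.log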

In terms of implementation, if you are on Ubuntu, installing RabbitMQ is as simple as:

sudo apt-get install rabbitmq-server

On CentOS w/EPEL repositories:

yum install rabbitmq-server

There are a number of Python bindings for RabbitMQ. Pika is one of them, and it happens to have been created by an employee of LShift, the company responsible for RabbitMQ.

Below is a bit of sample code from the Pika repo. You can easily imagine how the handle_delivery method would accept a message containing a filepath and push it to your CDN.

import sys
import pika
import asyncore

# Open an asyncore-based connection (the old Pika 0.5-era API) to the
# broker given on the command line, defaulting to localhost.
conn = pika.AsyncoreConnection(pika.ConnectionParameters(
        sys.argv[1] if len(sys.argv) > 1 else '127.0.0.1',
        credentials=pika.PlainCredentials('guest', 'guest')))

print 'Connected to %r' % (conn.server_properties,)

# Declare a durable queue to consume from.
ch = conn.channel()
ch.queue_declare(queue="test", durable=True, exclusive=False, auto_delete=False)

should_quit = False

def handle_delivery(ch, method, header, body):
    # Called once per delivered message: print it, then acknowledge it
    # so the broker knows it was processed.
    print "method=%r" % (method,)
    print "header=%r" % (header,)
    print "  body=%r" % (body,)
    ch.basic_ack(delivery_tag=method.delivery_tag)

    # This demo stops after one message; a real consumer would keep going.
    global should_quit
    should_quit = True

tag = ch.basic_consume(handle_delivery, queue='test')

# Drive the asyncore event loop one tick at a time until we are done.
while conn.is_alive() and not should_quit:
    asyncore.loop(count=1)
if conn.is_alive():
    ch.basic_cancel(tag)
    conn.close()

print conn.connection_close
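For example, a handle_delivery along these lines would complete the picture. The HTTP PUT to a bare hostname is purely illustrative (real CDNs have their own upload APIs), and the sketch assumes the same Python 2 era as the sample above:

import os
import httplib

def handle_delivery(ch, method, header, body):
    # body is the file path published by the web handler.
    path = body
    filename = os.path.basename(path)

    # Hypothetical CDN endpoint that accepts an HTTP PUT; replace
    # with your CDN provider's real upload mechanism.
    conn = httplib.HTTPConnection("cdn.example.com")
    with open(path, "rb") as f:
        conn.request("PUT", "/uploads/" + filename, f.read())
    response = conn.getresponse()

    # Acknowledge only after a successful transfer, so RabbitMQ
    # redelivers the message if this process dies mid-upload.
    if response.status in (200, 201, 204):
        ch.basic_ack(delivery_tag=method.delivery_tag)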

Advice on the tornado Google group points to using an async callback (documented at http://www.tornadoweb.org/documentation#non-blocking-asynchronous-requests) to move the file to the CDN.
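A minimal sketch of that approach, using the callback-style API from that era of Tornado (the @tornado.web.asynchronous decorator and fetch callbacks were removed in Tornado 6; the CDN URL and field name are assumptions):

import tornado.web
import tornado.httpclient

class AsyncUploadHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        # Path reported by the nginx upload module ("file.path" is a
        # placeholder for your configured field name).
        path = self.get_argument("file.path")
        with open(path, "rb") as f:
            data = f.read()

        # Start a non-blocking PUT to a hypothetical CDN endpoint;
        # the IOLoop keeps serving other requests in the meantime.
        request = tornado.httpclient.HTTPRequest(
            "http://cdn.example.com/uploads", method="PUT", body=data)
        tornado.httpclient.AsyncHTTPClient().fetch(
            request, callback=self.on_cdn_response)

    def on_cdn_response(self, response):
        # Runs when the transfer finishes; only now is the request done.
        if response.error:
            self.set_status(500)
        self.finish()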

The nginx upload module writes the file to disk and then passes parameters describing the upload(s) back to the view. The file therefore isn't in memory, and the time it takes to read it back from disk (which blocks the request process itself, though not other tornado processes, as far as I know) is negligible.

That said, anything that doesn't need to be processed online shouldn't be, and should be deferred to a task queue like celeryd or similar.
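With Celery (celeryd was its worker daemon at the time), the deferred transfer might look like the sketch below, written against the current Celery app API rather than the old celery.task decorators; the broker URL and task body are placeholders:

from celery import Celery

app = Celery("tasks", broker="amqp://guest@localhost//")

@app.task
def push_to_cdn(path):
    # Illustrative: replace with your CDN provider's upload call.
    print("uploading %s" % path)

The web handler then defers the work with push_to_cdn.delay(path) and returns immediately; a worker started with celery -A tasks worker picks the task up.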
