
How to achieve parallelism with tornado gen.Task / gen.coroutine decorators

Here is a case where one must bring parallelism into the backend server.

I want to query N ELBs, with 5 different queries each, and send the results back to the web client.

The backend is Tornado, and according to what I have read many times in the docs, I should be able to get several tasks processed in parallel if I use @gen.Task or @gen.coroutine.

However, I must be missing something here, as all my requests (20 in number: 4 ELBs * 5 queries) are processed one after another.

def query_elb(fn, region, elb_name, period, callback):
    callback(fn(region, elb_name, period))

class DashboardELBHandler(RequestHandler):

    @tornado.gen.coroutine
    def get_elb_info(self, region, elb_name, period):
        elbReq = yield gen.Task(query_elb, ELBSumRequest, region, elb_name, period)
        elb2XX = yield gen.Task(query_elb, ELBBackend2XX, region, elb_name, period)
        elb3XX = yield gen.Task(query_elb, ELBBackend3XX, region, elb_name, period)
        elb4XX = yield gen.Task(query_elb, ELBBackend4XX, region, elb_name, period)
        elb5XX = yield gen.Task(query_elb, ELBBackend5XX, region, elb_name, period)

        raise tornado.gen.Return( 
            [
                elbReq,
                elb2XX,
                elb3XX,
                elb4XX,
                elb5XX,
            ]
        )

    @tornado.web.authenticated
    @tornado.web.asynchronous
    @tornado.gen.coroutine
    def post(self):
        ret = []

        period = self.get_argument("period", "5m")

        cloud_deployment = db.foo.bar.baz()
        for region, deployment in cloud_deployment.iteritems():

            elb_name = deployment["elb"][0]
            res = yield self.get_elb_info(region, elb_name, period)
            ret.append(res)

        self.push_json(ret)



def ELBQuery(region, elb_name,  range_name, metric, statistic, unit):
    dimensions = { u"LoadBalancerName": [elb_name] }

    (start_stop, period) = calc_range(range_name)
    start, stop = start_stop

    cw = boto.ec2.cloudwatch.connect_to_region(region)
    data_points = cw.get_metric_statistics( period, start, stop, 
        metric, "AWS/ELB", statistic, dimensions, unit)    

    return data_points

ELBSumRequest   = lambda region, elb_name, range_name : ELBQuery(region, elb_name, range_name,  "RequestCount", "Sum", "Count")
ELBLatency      = lambda region, elb_name, range_name : ELBQuery(region, elb_name, range_name,  "Latency", "Average", "Seconds")
ELBBackend2XX   = lambda region, elb_name, range_name : ELBQuery(region, elb_name, range_name,  "HTTPCode_Backend_2XX", "Sum", "Count")
ELBBackend3XX   = lambda region, elb_name, range_name : ELBQuery(region, elb_name, range_name,  "HTTPCode_Backend_3XX", "Sum", "Count")
ELBBackend4XX   = lambda region, elb_name, range_name : ELBQuery(region, elb_name, range_name,  "HTTPCode_Backend_4XX", "Sum", "Count")
ELBBackend5XX   = lambda region, elb_name, range_name : ELBQuery(region, elb_name, range_name,  "HTTPCode_Backend_5XX", "Sum", "Count")

The problem is that ELBQuery is a blocking function. If it doesn't yield to another coroutine somewhere, there is no way for the coroutine scheduler to interleave the calls. (That's the whole point of coroutines: they're cooperative, not preemptive.)
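To see the difference, here is a minimal sketch (assuming Tornado is installed; `fake_query` is a hypothetical stand-in for a query that is genuinely non-blocking). Because `gen.sleep` yields a future instead of blocking, yielding a list of such futures lets them all make progress at once:

```python
import time

from tornado import gen
from tornado.ioloop import IOLoop


@gen.coroutine
def fake_query(seconds):
    # gen.sleep stands in for a truly non-blocking call: it yields a
    # future, so the scheduler can run other coroutines while we wait.
    yield gen.sleep(seconds)
    raise gen.Return(seconds)


@gen.coroutine
def run_all():
    # Yielding a list of futures waits for all of them concurrently.
    results = yield [fake_query(0.2) for _ in range(5)]
    raise gen.Return(results)


start = time.time()
results = IOLoop.current().run_sync(run_all)
elapsed = time.time() - start
# The five 0.2 s waits overlap, so total elapsed time stays near 0.2 s,
# not 1.0 s. Replace gen.sleep with a blocking time.sleep and the same
# code runs serially, which is exactly the symptom in the question.
```

If `fake_query` called a blocking function instead, the list form would not help: each call would monopolize the IOLoop thread until it returned.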

If the problem is something like the calc_range call, that would probably be easy to deal with: break it up into smaller pieces, where each piece yields control before starting the next, which gives the scheduler a chance to get in between each piece.
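For CPU-bound work like that, the chunking idea might look like this sketch (again assuming Tornado; `chunked_calc` is a hypothetical stand-in for the long calculation, and `gen.moment` hands control back to the IOLoop between pieces):

```python
from tornado import gen
from tornado.ioloop import IOLoop


@gen.coroutine
def chunked_calc(items):
    # Hypothetical stand-in for a long CPU-bound calculation.
    out = []
    for item in items:
        out.append(item * item)  # one small piece of the work
        yield gen.moment         # let other coroutines run between pieces
    raise gen.Return(out)


result = IOLoop.current().run_sync(lambda: chunked_calc([1, 2, 3]))
```

This keeps the event loop responsive, but note it doesn't make the work finish any faster; it only stops one long computation from starving everything else.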

But most likely, it's the boto calls that are blocking, and most of your function's time is spent waiting around for get_metric_statistics to return, while nothing else can run.

So, how do you fix this?

  1. Spin off a thread for each boto task. Tornado makes it pretty easy to transparently wrap a coroutine around a thread or thread-pool task, which magically unblocks everything. But of course there's a cost to using threads too.
  2. Schedule the boto tasks on a thread pool instead of a thread apiece. Similar tradeoffs to #1, especially if you only have a handful of tasks. (But if you could be doing 5 tasks each for 500 different users, you probably want a shared pool.)
  3. Rewrite or monkeypatch boto to use coroutines. This would be the ideal solution… but it's the most work (and the most risk of breaking code you don't understand, and having to maintain it as boto updates, etc.). However, there are people who have at least gotten started on this, like the asyncboto project.
  4. Use greenlets and monkeypatch enough of the library's dependencies to trick it into being async. This sounds hacky, but it may actually be the best solution; see Marrying Boto to Tornado for this.
  5. Use greenlets and monkeypatch the whole stdlib a la gevent to trick boto and tornado into working together without even realizing it. This sounds like a terrible idea; you'd be better off porting your whole app to gevent.
  6. Use a separate process (or even a pool of them) that uses something like gevent.
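A shared-pool sketch for options #1/#2 might look like this (hypothetical names throughout; `blocking_query` stands in for the real ELBQuery/boto call, and `IOLoop.run_in_executor` requires Tornado 5.0+):

```python
from concurrent.futures import ThreadPoolExecutor

from tornado import gen
from tornado.ioloop import IOLoop

# One shared pool (option #2): at most max_workers blocking calls run at
# a time, no matter how many handlers are in flight.
executor = ThreadPoolExecutor(max_workers=8)


def blocking_query(n):
    # Stand-in for the blocking ELBQuery/boto call.
    return n * n


@gen.coroutine
def get_all(ns):
    # Each blocking call becomes a future running on a pool thread;
    # yielding the whole list waits for all of them in parallel.
    futures = [IOLoop.current().run_in_executor(executor, blocking_query, n)
               for n in ns]
    results = yield futures
    raise gen.Return(results)


results = IOLoop.current().run_sync(lambda: get_all([1, 2, 3, 4, 5]))
```

In the question's handler, the same pattern would mean dispatching all 20 queries as futures and yielding them as one list, instead of yielding each gen.Task one at a time.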

Without knowing more details, I'd suggest looking at #2 and #4 first, but I can't promise they'll turn out to be the best answer for you.
