
How to efficiently iterate over multiple generators?

I've got three different generators, each of which yields data from the web, so each iteration may take a while to complete.

I want to interleave the calls to the generators, and thought about roundrobin (found here). The problem is that every call blocks until it's done.

Is there a way to loop through all the generators at the same time, without blocking?

You can do this with the iter() method on my ThreadPool class.

pool.iter() yields the return values of the threaded functions until all of the decorated and called functions have finished executing. Decorate all of your async functions, call them, then loop through pool.iter() to catch the values as they arrive.

Example:

import time
from threadpool import ThreadPool
pool = ThreadPool(max_threads=25, catch_returns=True)

# decorate any functions you need to aggregate
# if you're pulling a function from an outside source
# you can still say 'func = pool(func)' or 'pool(func)()'
@pool
def data(ID, start):
    for i in range(start, start + 4):
        yield ID, i
        time.sleep(1)

# each of these calls will spawn a thread and return immediately
# make sure you do either pool.finish() or pool.iter()
# otherwise your program will exit before the threads finish
data("generator 1", 5)
data("generator 2", 10)
data("generator 3", 64)

for value in pool.iter():
    # this will print the generators' return values as they yield
    print(value)
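
If you don't have that ThreadPool class at hand, roughly the same pattern can be sketched with only the standard library: one thread per generator, all pushing into a shared queue that the main loop reads from. This is a minimal sketch for Python 3, not the ThreadPool implementation; the names iter_threaded, drain and _DONE are made up for illustration, and the sleep in data() stands in for a slow web request.

```python
import queue
import threading
import time

_DONE = object()  # hypothetical sentinel: one generator has finished


def iter_threaded(*generators):
    """Yield items from several generators as soon as each one produces them."""
    results = queue.Queue()

    def drain(gen):
        # Runs in a worker thread: forward every item, then signal completion.
        # The sentinel is sent even if the generator raises, so the consumer
        # never waits forever.
        try:
            for item in gen:
                results.put(item)
        finally:
            results.put(_DONE)

    for gen in generators:
        threading.Thread(target=drain, args=(gen,), daemon=True).start()

    remaining = len(generators)
    while remaining:
        item = results.get()  # blocks only until *any* generator has an item
        if item is _DONE:
            remaining -= 1
        else:
            yield item


def data(ID, start):
    for i in range(start, start + 4):
        yield ID, i
        time.sleep(0.01)  # stand-in for a slow network call


for value in iter_threaded(data("generator 1", 5),
                           data("generator 2", 10),
                           data("generator 3", 64)):
    print(value)
```

Items from one generator keep their relative order, but the interleaving between generators depends on timing. The worker threads are daemons, so they won't keep the program alive if you stop iterating early.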

In short, no: there's no good way to do this without threads.

Sometimes ORMs are augmented with some kind of peek function or callback that signals when data is available. Otherwise, you'll need to spawn threads to do this. If threads are not an option, you might try swapping your database library for an asynchronous one.
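
To give an idea of what the asynchronous route could look like, here is a rough sketch using asyncio (Python 3.7+). It assumes your data sources can be written as async generators; fetch() is a hypothetical stand-in that just sleeps, and merge(), drain() and _DONE are names invented for this example, not part of any library.

```python
import asyncio

_DONE = object()  # hypothetical sentinel: one source is exhausted


async def merge(*agens):
    """Yield items from several async generators in arrival order."""
    results = asyncio.Queue()

    async def drain(agen):
        # Forward every item to the shared queue, then signal completion,
        # even if the source raises or the task is cancelled.
        try:
            async for item in agen:
                await results.put(item)
        finally:
            results.put_nowait(_DONE)

    tasks = [asyncio.create_task(drain(agen)) for agen in agens]
    remaining = len(tasks)
    try:
        while remaining:
            item = await results.get()
            if item is _DONE:
                remaining -= 1
            else:
                yield item
    finally:
        # Stop any sources still running if the consumer quits early.
        for task in tasks:
            task.cancel()


async def fetch(name, start):
    # Assumption: stands in for an async database or HTTP call.
    for i in range(start, start + 3):
        await asyncio.sleep(0.01)
        yield name, i


async def main():
    return [value async for value in merge(fetch("a", 0), fetch("b", 10))]


for value in asyncio.run(main()):
    print(value)
```

Everything runs in a single thread here; while one source is waiting on I/O, the event loop lets the others make progress.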
