“Lazy” version of asyncio.gather?
I'm using Python's asyncio module and async/await to process a character sequence in chunks concurrently and collect the results in a list. For that I'm using a chunker function (split) and a chunk-processing function (process_chunk). They both come from a third-party library, and I would prefer not to change them.
Chunking is slow, and the number of chunks is not known up front, which is why I don't want to consume the whole chunk generator at once. Ideally, the code should advance the generator in sync with process_chunk's semaphore, i.e., every time that function returns.
My code
import asyncio

def split(sequence):
    for x in sequence:
        print('Getting the next chunk:', x)
        yield x
    print('Finished chunking')

async def process_chunk(chunk, *, semaphore=asyncio.Semaphore(2)):
    async with semaphore:
        print('Processing chunk:', chunk)
        await asyncio.sleep(3)
        return 'OK'

async def process_in_chunks(sequence):
    gen = split(sequence)
    coro = [process_chunk(chunk) for chunk in gen]
    results = await asyncio.gather(*coro)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(process_in_chunks('ABC'))
kind of works and prints
Getting the next chunk: A
Getting the next chunk: B
Getting the next chunk: C
Finished chunking
Processing chunk: C
Processing chunk: B
Processing chunk: A
although that means that the gen generator is exhausted before the processing begins: the list comprehension drains it eagerly, before gather ever runs. I know why it happens, but how do I change that?
If you don't mind having an external dependency, you can use aiostream.stream.map:
from aiostream import stream, pipe

async def process_in_chunks(sequence):
    # Asynchronous sequence of chunks
    xs = stream.iterate(split(sequence))
    # Asynchronous sequence of results
    ys = xs | pipe.map(process_chunk, task_limit=2)
    # Aggregation of the results into a list
    zs = ys | pipe.list()
    # Run the stream
    results = await zs
    print(results)
The chunks are generated lazily and fed to the process_chunk coroutine. The number of coroutines running concurrently is controlled by task_limit. That means the semaphore in process_chunk is no longer necessary.
Output:
Getting the next chunk: A
Processing chunk: A
Getting the next chunk: B
Processing chunk: B
# Pause 3 seconds
Getting the next chunk: C
Processing chunk: C
Finished chunking
# Pause 3 seconds
['OK', 'OK', 'OK']
See more examples in this demonstration and the documentation.
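As a side note, the intermediate names are optional: aiostream operators compose with |, so the same pipeline can be written as one chained expression. A minimal sketch, assuming the same split and process_chunk as above:

from aiostream import stream, pipe

async def process_in_chunks(sequence):
    # Same pipeline, composed in a single expression: iterate the
    # generator lazily, map with bounded concurrency, collect a list.
    return await (
        stream.iterate(split(sequence))
        | pipe.map(process_chunk, task_limit=2)
        | pipe.list()
    )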
Use next to iterate through gen manually: acquire the semaphore before advancing the generator, and release it in a done callback once a chunk finishes, so that chunking stays in lockstep with processing.
import asyncio

# third-party:
def split(sequence):
    for x in sequence:
        print('Getting the next chunk:', x)
        yield x
    print('Finished chunking')

async def process_chunk(chunk, *, semaphore=asyncio.Semaphore(2)):
    async with semaphore:
        print('Processing chunk:', chunk)
        await asyncio.sleep(3)
        return 'OK'

# our code:
sem = asyncio.Semaphore(2)  # let's use our own semaphore

async def process_in_chunks(sequence):
    tasks = []
    gen = split(sequence)
    while True:
        await sem.acquire()
        try:
            chunk = next(gen)
        except StopIteration:
            break
        else:
            task = asyncio.ensure_future(process_chunk(chunk))  # task to run concurrently
            task.add_done_callback(lambda *_: sem.release())  # allow the next chunk to be processed
            tasks.append(task)
    await asyncio.gather(*tasks, return_exceptions=True)  # await all pending tasks
    results = [task.result() for task in tasks]
    return results

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(process_in_chunks('ABCDE'))
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()
Output:
Getting the next chunk: A
Getting the next chunk: B
Processing chunk: A
Processing chunk: B
Getting the next chunk: C
Getting the next chunk: D
Processing chunk: C
Processing chunk: D
Getting the next chunk: E
Finished chunking
Processing chunk: E
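For what it's worth, on Python 3.11+ the same lazy pattern can be expressed with asyncio.run and asyncio.TaskGroup instead of managing the loop by hand. A minimal sketch, assuming the split and process_chunk functions from the question are in scope:

import asyncio

# split and process_chunk as defined in the question

async def process_in_chunks(sequence, limit=2):
    sem = asyncio.Semaphore(limit)

    async def run(chunk):
        try:
            return await process_chunk(chunk)
        finally:
            sem.release()  # free a slot so the next chunk can be pulled

    gen = split(sequence)
    tasks = []
    async with asyncio.TaskGroup() as tg:
        while True:
            await sem.acquire()  # wait for a free slot *before* chunking
            try:
                chunk = next(gen)  # advance the generator lazily
            except StopIteration:
                break
            tasks.append(tg.create_task(run(chunk)))
    # the TaskGroup has awaited all tasks by this point
    return [task.result() for task in tasks]

if __name__ == '__main__':
    print(asyncio.run(process_in_chunks('ABCDE')))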