async generator with slow consumer
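If I have a slow consumer of an async generator that emits values at a high rate, and I only care about consuming the latest value (i.e. I don't mind dropping values), is there an elegant way to achieve this? I've looked at `aiostream` but couldn't find anything that fits.

Here is a simple example: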
```python
import asyncio
import aiostream


async def main():
    xs = aiostream.stream.count(interval=0.2)
    async with xs.stream() as stream:
        async for x in stream:  # do something here to drop updates that aren't processed in time
            print(x)
            await asyncio.sleep(1.0)


if __name__ == "__main__":
    asyncio.run(main())
```
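I suggest using a class that wraps the external generator, since I'm not aware of any existing library that does this.

The class consumes the generator inside a task and keeps only the last value. It acts as a wrapper around the generator you actually want to consume.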
```python
import asyncio


class RelaxedGenerator:
    def __init__(self, async_gen):
        self.last_value = None      # the last value generated
        self.consumed_last = True   # flags the last value as consumed
        self.async_gen = async_gen  # generator whose values we may drop
        self.exhausted = False      # flags the generator as fully consumed
        self._task = None           # reference to the background task

    @classmethod
    async def start(cls, async_gen):
        self = cls(async_gen())
        # keep a reference: the event loop only holds weak references to tasks
        self._task = asyncio.create_task(self.generate())
        return self

    async def generate(self):
        # consume the external async generator here,
        # saving only the last value for further processing
        while True:
            try:
                self.last_value = await self.async_gen.__anext__()
                self.consumed_last = False
            except StopAsyncIteration:
                self.exhausted = True
                break

    async def stream(self):
        while not self.exhausted:
            if self.consumed_last:
                await asyncio.sleep(0.01)  # yield control so the loop is not blocked
                continue
            self.consumed_last = True
            yield self.last_value
```
Testing with a simple generator:
```python
import asyncio
from random import uniform


async def numbers_stream(max_=100):
    next_int = -1
    while next_int < max_:
        next_int += 1
        yield next_int
        await asyncio.sleep(0.2)


async def main():
    gen = await RelaxedGenerator.start(numbers_stream)
    async for value in gen.stream():
        print(value, end=", ", flush=True)
        await asyncio.sleep(uniform(1, 2))


asyncio.run(main())
```
Output:
0, 6, 15, 21, 28, 38, 43, 48, 57, 65, 73, 81, 89, 96,
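Two other things to keep in mind: whether you need to process the very last value, and whether the generator you are consuming will ever be exhausted in practice. Here I assume that you don't care about the last value and that the generator may be exhausted.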
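You can add a queue between the producer and the consumer that forgets old results. Unfortunately the standard library has no implementation of this, but it is almost there. If you look at the implementation of `asyncio.Queue`, you will notice that it uses `collections.deque`; see https://github.com/python/cpython/blob/3.10/Lib/asyncio/queues.py#L49 . `collections.deque` takes an optional `maxlen` argument that makes it discard previously added items; see https://docs.python.org/3/library/collections.html#collections.deque . Using this, we can create a custom queue that keeps only the last n items.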
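```python
import asyncio
import collections


class RollingQueue(asyncio.Queue):
    def _init(self, maxsize):
        # hook that asyncio.Queue calls to create its storage;
        # a bounded deque silently drops the oldest item when it is full
        self._queue = collections.deque(maxlen=maxsize)

    def full(self):
        # never report "full", so put_nowait() never raises QueueFull
        return False
```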
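The eviction that `maxlen` provides, and what it means for the queue, can be checked directly. A small sketch (the class is repeated so the snippet runs on its own, and `demo` is just an illustrative name):

```python
import asyncio
import collections


class RollingQueue(asyncio.Queue):
    # Same class as above, repeated so this snippet is self-contained.
    def _init(self, maxsize):
        self._queue = collections.deque(maxlen=maxsize)

    def full(self):
        return False


async def demo():
    # A bounded deque keeps only the newest `maxlen` items.
    window = collections.deque(maxlen=2)
    for n in range(5):
        window.append(n)

    # The queue inherits that behaviour: five puts, one item left.
    queue = RollingQueue(1)
    for n in range(5):
        queue.put_nowait(n)
    return list(window), queue.qsize(), queue.get_nowait()


window, size, latest = asyncio.run(demo())
print(window, size, latest)  # prints: [3, 4] 1 4
```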
Now you can use this queue as follows:
```python
async def numbers(nmax):
    for n in range(nmax):
        yield n
        await asyncio.sleep(0.3)


async def fill_queue(producer, queue):
    async for item in producer:
        queue.put_nowait(item)
    queue.put_nowait(None)  # sentinel: the producer is done


async def main():
    queue1 = RollingQueue(1)
    numgen = numbers(10)
    # keep a reference to the task so it is not garbage-collected mid-run
    task = asyncio.create_task(fill_queue(numgen, queue1))
    while True:
        res = await queue1.get()
        if res is None:
            break
        print(res)
        await asyncio.sleep(1)


asyncio.run(main())
```
I set the queue size to 1 so that only the last item is kept, as the question requires.
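One thing to watch with a size of 1: the `None` sentinel goes through the same deque, so it evicts a final item that the consumer has not read yet. A minimal check of that behaviour (the class is repeated so the snippet runs on its own; `demo` is an illustrative name):

```python
import asyncio
import collections


class RollingQueue(asyncio.Queue):
    # Same class as above, repeated so this snippet is self-contained.
    def _init(self, maxsize):
        self._queue = collections.deque(maxlen=maxsize)

    def full(self):
        return False


async def demo():
    queue = RollingQueue(1)
    queue.put_nowait(9)     # final item from the producer
    queue.put_nowait(None)  # sentinel arrives before the consumer wakes up
    return queue.qsize(), queue.get_nowait()


size, item = asyncio.run(demo())
print(size, item)  # prints: 1 None
```

If the final item matters, one option (my assumption, not something the answer proposes) is to signal completion outside the queue, for example with a flag or by checking whether the producer task is done.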
Combining the two answers provided, I came up with the following solution, which seems to work well:
```python
import asyncio
import aiostream
import collections


class RollingQueue(asyncio.Queue):
    def _init(self, maxsize):
        self._queue = collections.deque(maxlen=maxsize)

    def full(self):
        return False


@aiostream.operator(pipable=True)
async def drop_stream(source, max_n=1):
    queue = RollingQueue(max_n)
    exhausted = False

    async def inner_task():
        nonlocal exhausted
        async with aiostream.streamcontext(source) as streamer:
            async for item in streamer:
                queue.put_nowait(item)
        exhausted = True

    task = asyncio.create_task(inner_task())
    try:
        # Note: if the source ends while this loop is waiting on queue.get(),
        # nothing wakes it up; that is fine for an endless source like count().
        while not exhausted:
            item = await queue.get()
            yield item
    finally:
        task.cancel()


async def main():
    xs = aiostream.stream.count(interval=0.2) | drop_stream.pipe(1) | aiostream.pipe.take(5)
    async with xs.stream() as stream:
        async for x in stream:
            print(x)
            await asyncio.sleep(1.0)


if __name__ == "__main__":
    asyncio.run(main())
```