Async generator with slow consumer

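I have a slow consumer of an async generator that emits values at a high frequency, and I only care about consuming the latest value (i.e. I don't mind dropping values). Is there an elegant way to achieve this? I've looked at aiostream, but couldn't find anything suitable.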

Here is a simple example:

import asyncio
import aiostream

async def main():

    xs = aiostream.stream.count(interval=0.2)

    async with xs.stream() as stream:
        async for x in stream: # do something here to drop updates that aren't processed in time
            print(x)
            await asyncio.sleep(1.0)


if __name__ == "__main__":
    asyncio.run(main())

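I suggest using a class that handles the external generator, since I don't know of any existing library that does this.

The class consumes the generator inside a task and keeps only the last value. It acts as a wrapper around the generator you actually want to consume.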

import asyncio

class RelaxedGenerator:

   def __init__(self, async_gen):
      self.last_value = None        # the last value generated
      self.consumed_last = True     # flags the last value as consumed
      self.async_gen = async_gen    # generator whose values may be dropped
      self.exhausted = False        # flags the generator as fully consumed

   @classmethod
   async def start(cls, async_gen):
      self = cls(async_gen())
      # keep a reference so the background task is not garbage-collected
      self._task = asyncio.create_task(self.generate())
      return self

   async def generate(self):
      # consume the external async generator in the background
      # and keep only the most recent value for the consumer
      try:
         async for value in self.async_gen:
            self.last_value = value
            self.consumed_last = False
      finally:
         # also set on errors, so stream() never polls forever
         self.exhausted = True

   async def stream(self):
      while not self.exhausted:
         if self.consumed_last:
            await asyncio.sleep(0.01)  # yield control to the event loop while waiting
            continue
         self.consumed_last = True
         yield self.last_value

Testing with a simple generator:

import asyncio
from random import uniform

async def numbers_stream(max_=100):
   next_int = -1
   while next_int < max_:
      next_int += 1
      yield next_int
      await asyncio.sleep(0.2)

async def main():
   gen = await RelaxedGenerator.start(numbers_stream)
   async for value in gen.stream():
      print(value, end=", ", flush=True)
      await asyncio.sleep(uniform(1, 2))

asyncio.run(main())

Output:

0, 6, 15, 21, 28, 38, 43, 48, 57, 65, 73, 81, 89, 96,

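Two other things to keep in mind are whether you need to process the final value, and whether the generator you are consuming will ever be exhausted in practice. Here I assume that you don't care about the final value and that the generator may be exhausted.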
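
If the final value does matter, one possible variant replaces the polling loop with an asyncio.Event and always delivers the last value before stopping. This is only a sketch under my own assumptions: the names latest, pump and numbers are hypothetical, and errors raised by the source are not propagated to the consumer.

```python
import asyncio


async def latest(source):
    """Yield the newest value of `source`, skipping any the consumer missed."""
    slot = []                 # holds at most one unconsumed value
    wakeup = asyncio.Event()  # set when a value arrives or the source ends
    finished = False

    async def pump():
        nonlocal finished
        try:
            async for value in source:
                slot[:] = [value]  # overwrite whatever was not consumed yet
                wakeup.set()
        finally:
            finished = True
            wakeup.set()  # wake the consumer so that it can stop

    task = asyncio.create_task(pump())
    try:
        while True:
            await wakeup.wait()
            wakeup.clear()
            if slot:
                yield slot.pop()
            if finished and not slot:
                break
    finally:
        task.cancel()


async def numbers(n, delay):
    # hypothetical fast producer used only for this demonstration
    for i in range(n):
        yield i
        await asyncio.sleep(delay)


async def main():
    seen = []
    async for value in latest(numbers(10, 0.01)):
        seen.append(value)
        await asyncio.sleep(0.05)  # slow consumer
    return seen


print(asyncio.run(main()))  # ends with 9; which values are skipped depends on timing
```

Because the consumer waits on the event instead of sleeping in a loop, it wakes up as soon as a new value or the end of the source is signalled.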

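You can add a queue between the producer and the consumer that forgets old results. Unfortunately, the standard library has no implementation of it, but it is almost there. If you check the implementation of asyncio.Queue, you will notice that it uses collections.deque (see https://github.com/python/cpython/blob/3.10/Lib/asyncio/queues.py#L49). collections.deque takes an optional maxlen argument that makes it discard the oldest items once it is full (see https://docs.python.org/3/library/collections.html#collections.deque). Using this, we can create a custom queue that keeps only the last n items.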

import asyncio
import collections

class RollingQueue(asyncio.Queue):
    def _init(self, maxsize):
        self._queue = collections.deque(maxlen=maxsize)
        
    def full(self):
        return False
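
To see the eviction behaviour in isolation, here is a small self-contained sketch. It repeats the RollingQueue subclass so that it runs on its own; the names window and demo are made up for the illustration.

```python
import asyncio
import collections

# A bounded deque silently evicts from the opposite end once it is full.
window = collections.deque(maxlen=2)
for n in range(5):
    window.append(n)
print(list(window))  # [3, 4]


class RollingQueue(asyncio.Queue):
    # same subclass as above, repeated so that this snippet is runnable alone
    def _init(self, maxsize):
        self._queue = collections.deque(maxlen=maxsize)

    def full(self):
        return False


async def demo():
    queue = RollingQueue(1)
    for n in range(5):
        queue.put_nowait(n)  # never raises QueueFull, older items are evicted
    return queue.qsize(), queue.get_nowait()


print(asyncio.run(demo()))  # (1, 4)
```

A few caveats with this approach: _init is a private hook of asyncio.Queue, so it may change between Python versions. The queue must be created with a maxsize of at least 1, because the default maxsize=0 would give deque(maxlen=0), which discards every item. And since evicted items are never marked with task_done(), join() is not usable with this queue.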

Now you can use this queue as follows:

async def numbers(nmax):
    for n in range(nmax):
        yield n
        await asyncio.sleep(0.3)
        
async def fill_queue(producer, queue):
    async for item in producer:
        queue.put_nowait(item)
    queue.put_nowait(None)

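async def main():
    queue1 = RollingQueue(1)
    numgen = numbers(10)
    # keep a reference so the background task is not garbage-collected
    task = asyncio.create_task(fill_queue(numgen, queue1))
    while True:
        res = await queue1.get()
        if res is None:  # sentinel: the producer is done
            break
        print(res)
        await asyncio.sleep(1)

asyncio.run(main())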

I set the queue size to 1 so that only the last item is kept, as the question requires.

Combining the two answers provided, I came up with the following solution, which seems to work well:

import asyncio
import aiostream
import collections

class RollingQueue(asyncio.Queue):
    def _init(self, maxsize):
        self._queue = collections.deque(maxlen=maxsize)
        
    def full(self):
        return False

@aiostream.operator(pipable=True)
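async def drop_stream(source, max_n=1):
    queue = RollingQueue(max_n)
    done = object()  # sentinel that wakes the consumer when the source ends
    exhausted = False

    async def inner_task():
        nonlocal exhausted
        try:
            async with aiostream.streamcontext(source) as streamer:
                async for item in streamer:
                    queue.put_nowait(item)
        finally:
            exhausted = True
            # only signal when nothing is pending, so no real item is evicted
            if queue.empty():
                queue.put_nowait(done)

    task = asyncio.create_task(inner_task())
    try:
        # stop once the source has ended and every pending item was delivered
        while not (exhausted and queue.empty()):
            item = await queue.get()
            if item is done:
                break
            yield item
    finally:
        task.cancel()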


async def main():

    xs = aiostream.stream.count(interval=0.2) | drop_stream.pipe(1) | aiostream.pipe.take(5)

    async with xs.stream() as stream:
        async for x in stream:
            print(x)
            await asyncio.sleep(1.0)


if __name__ == "__main__":
    asyncio.run(main())
