
How to yield from an async for loop using asyncio?

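I am trying to write a simple asynchronous data batch generator, but I am having trouble understanding how to yield from an async for loop. Here is a simple class that illustrates my idea: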

import asyncio
from typing import List

class AsyncSimpleIterator:
    def __init__(self, data: List[str], batch_size=None):
        self.data = data
        self.batch_size = batch_size
        self.doc2index = self.get_doc_ids()

    def get_doc_ids(self):
        return list(range(len(self.data)))

    async def get_batch_data(self, doc_ids):
        print("get_batch_data() running")
        page = [self.data[j] for j in doc_ids]
        return page

    async def get_docs(self, batch_size):
        print("get_docs() running")

        _batch_size = self.batch_size or batch_size
        batches = [self.doc2index[i:i + _batch_size] for i in
                   range(0, len(self.doc2index), _batch_size)]

        for _, doc_ids in enumerate(batches):
            docs = await self.get_batch_data(doc_ids)
            yield docs, doc_ids

    async def main(self):
        print("main() running")
        async for res in self.get_docs(batch_size=2):
            print(res)  # how to yield instead of print?

    def gen_batches(self):
        # how to get results of self.main() here?
        loop = asyncio.get_event_loop()
        loop.run_until_complete(self.main())
        loop.close()


DATA = ["Hello, world!"] * 4
iterator = AsyncSimpleIterator(DATA)
iterator.gen_batches()

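So my question is: how do I yield the results from main() so that they can be collected in gen_batches()?

When I print the results inside main(), I get the following output: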

main() running
get_docs() running
get_batch_data() running
(['Hello, world!', 'Hello, world!'], [0, 1])
get_batch_data() running
(['Hello, world!', 'Hello, world!'], [2, 3])

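"I am trying to write a simple asynchronous data batch generator, but I am having trouble understanding how to yield from an async for loop"

Yielding from an async for works like a regular yield, except that the yielded values also have to be consumed by an async for or an equivalent. For example, the yield in get_docs makes it an async generator. If you replace print(res) with yield res in main(), main() becomes an async generator as well.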
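As a minimal sketch of that point (the function names numbers, doubled, and consume are hypothetical and not part of the question's class), the following shows a yield inside an async for turning the enclosing function into an async generator, which is then consumed with another async for:

```python
import asyncio


async def numbers(n):
    # A `yield` inside `async def` makes this an async generator.
    for i in range(n):
        await asyncio.sleep(0)  # stand-in for real asynchronous work
        yield i


async def doubled(n):
    # Re-yielding from an `async for` loop makes this an async generator too.
    async for value in numbers(n):
        yield value * 2


async def consume(n):
    # An async generator is consumed with `async for` (or an async comprehension).
    results = []
    async for value in doubled(n):
        results.append(value)
    return results


print(asyncio.run(consume(3)))  # prints [0, 2, 4]
```

Here doubled() plays the role of main() with print(res) replaced by yield res, and consume() plays the role of the code that collects the results.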

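"The generator from main() should be consumed in gen_batches(), so I can collect all the results in gen_batches()"

To collect the values produced by an async generator (such as main() with print(res) replaced by yield res), you can use a helper coroutine: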

def gen_batches(self):
    loop = asyncio.get_event_loop()
    async def collect():
        return [item async for item in self.main()]
    items = loop.run_until_complete(collect())
    loop.close()
    return items

The collect() helper uses a PEP 530 asynchronous comprehension, which can be thought of as syntactic sugar for the more explicit:

    async def collect():
        l = []
        async for item in self.main():
            l.append(item)
        return l
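If the batches should reach synchronous code one at a time instead of as a finished list, one possible approach is to step the async generator from an ordinary generator. This is only a sketch under assumptions: get_docs here is a simplified, hypothetical stand-in for the method in the question, iter_batches is a made-up name, and the code assumes no event loop is already running in the calling thread.

```python
import asyncio


async def get_docs(data, batch_size):
    # Hypothetical stand-in for the question's get_docs() async generator.
    for start in range(0, len(data), batch_size):
        await asyncio.sleep(0)  # placeholder for real async I/O
        yield data[start:start + batch_size]


def iter_batches(data, batch_size):
    # Ordinary (synchronous) generator that pulls one batch per step
    # from the async generator by running the event loop for each step.
    loop = asyncio.new_event_loop()
    agen = get_docs(data, batch_size)

    async def get_next():
        # Report exhaustion as a flag instead of letting the exception escape.
        try:
            return False, await agen.__anext__()
        except StopAsyncIteration:
            return True, None

    async def close():
        await agen.aclose()

    try:
        while True:
            done, batch = loop.run_until_complete(get_next())
            if done:
                break
            yield batch
    finally:
        # Finalize the async generator (relevant if the caller stops early),
        # then close the loop created above.
        loop.run_until_complete(close())
        loop.close()


for batch in iter_batches(["a", "b", "c", "d", "e"], batch_size=2):
    print(batch)
```

Compared with collect(), this keeps only one batch in memory at a time, at the cost of re-entering the event loop for every item.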

A working solution to the original question, based on @user4815162342's answer:

import asyncio
from typing import List


class AsyncSimpleIterator:

    def __init__(self, data: List[str], batch_size=None):
        self.data = data
        self.batch_size = batch_size
        self.doc2index = self.get_doc_ids()

    def get_doc_ids(self):
        return list(range(len(self.data)))

    async def get_batch_data(self, doc_ids):
        print("get_batch_data() running")
        page = [self.data[j] for j in doc_ids]
        return page

    async def get_docs(self, batch_size):
        print("get_docs() running")

        _batch_size = self.batch_size or batch_size
        batches = [self.doc2index[i:i + _batch_size] for i in
                   range(0, len(self.doc2index), _batch_size)]

        for _, doc_ids in enumerate(batches):
            docs = await self.get_batch_data(doc_ids)
            yield docs, doc_ids

    def gen_batches(self):
        loop = asyncio.get_event_loop()

        async def collect():
            return [j async for j in self.get_docs(batch_size=2)]

        items = loop.run_until_complete(collect())
        loop.close()
        return items


DATA = ["Hello, world!"] * 4
iterator = AsyncSimpleIterator(DATA)
result = iterator.gen_batches()
print(result)
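On Python 3.7 and later, asyncio.run() can take the place of the manual get_event_loop() / run_until_complete() / close() sequence. The following is a minimal sketch rather than the author's code: get_docs and gen_batches are written as hypothetical module-level functions to keep the example short.

```python
import asyncio
from typing import List


async def get_docs(data: List[str], batch_size: int):
    # Simplified, hypothetical version of the class's get_docs() method.
    doc_ids = list(range(len(data)))
    for start in range(0, len(doc_ids), batch_size):
        ids = doc_ids[start:start + batch_size]
        await asyncio.sleep(0)  # placeholder for real async work
        yield [data[j] for j in ids], ids


def gen_batches(data: List[str], batch_size: int = 2):
    async def collect():
        return [item async for item in get_docs(data, batch_size)]

    # asyncio.run() creates a new event loop, runs the coroutine, and closes the loop.
    return asyncio.run(collect())


print(gen_batches(["Hello, world!"] * 4))
# prints [(['Hello, world!', 'Hello, world!'], [0, 1]), (['Hello, world!', 'Hello, world!'], [2, 3])]
```

Because asyncio.run() starts a fresh loop on every call, gen_batches() can be called more than once, which the version that closes the loop returned by get_event_loop() may not allow.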

