
Making sense of asyncio.as_completed()

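Imagine a simple program like this one, which I use to fetch a data field from multi-page JSON data available through an API gateway. (Sorry, I couldn't find a free JSON API that supports pagination to make the example fully reproducible.)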

import asyncio
import aiohttp

async def fetch(url, params = None):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as response:
            return await response.json()

async def get_all_pages(base_url):
    def paginate(size=10**6):
        limit = 100
        offset = 0
        while offset <= size:
            yield {"offset": offset, "limit": limit}
            offset += limit
    total = (await fetch(base_url))["data"]["total"] # total number of pages
    coroutines = [fetch(base_url, params) for params in paginate(total)]
    print("total number of pages: {}, total number of coroutines: {}".format(total, len(coroutines)))
    for routine in asyncio.as_completed(coroutines):
        r = await routine
        yield r["data"]["field"] #a field in the data for each page

async def main():
    url = "http://arandomurl.com"
    results = []
    async for x in get_all_pages(url):
        results.append(x)

    print(len(results)) #returns 1 -> only the first element is returned

asyncio.run(main())

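The problem is that the for loop in my main function retrieves only the first element of my generator; somehow the generator stops after yielding the first element. This means as_completed is not working the way I thought it would inside get_all_pages: yielding the result of each coroutine as it completes, which is then passed on to the yield r["data"]["field"] line. How can I do this properly?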

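Here is a test program I wrote. I took the code posted in the question and replaced the body of the function "fetch" so that it returns a dictionary. With that change I can actually run the program, and it works. I get one item in "results" for every 100 "pages".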

import asyncio

async def fetch(_url, params = None):
    if params is None:
        return {"data": {"total": 169}}
    return {"data": {"field" : str(params)}}

async def get_all_pages(base_url):
    def paginate(size=10**6):
        limit = 100
        offset = 0
        while offset <= size:
            yield {"offset": offset, "limit": limit}
            offset += limit
    total = (await fetch(base_url))["data"]["total"] # total number of pages
    coroutines = [fetch(base_url, params) for params in paginate(total)]
    print("total number of pages: {}, total number of coroutines: {}".format(
        total, len(coroutines)))
    for routine in asyncio.as_completed(coroutines):
        r = await routine
        yield r["data"]["field"] #a field in the data for each page

async def main():
    url = "http://arandomurl.com"
    results = []
    async for x in get_all_pages(url):
        results.append(x)

    print(results) # with total = 169 this prints two items, one per offset (0 and 100)

asyncio.run(main())
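
To make the "one item per 100 pages" behaviour easier to see, here is a minimal sketch along the same lines. It assumes the same paginate logic as in the question; fake_fetch, collect, count_requests and the artificial delays are hypothetical helpers added only for illustration. It counts how many requests paginate produces for a few totals, and staggers the delays so that the results come back in completion order rather than creation order.

```python
import asyncio


def paginate(size, limit=100):
    # Same logic as the paginate() helper in the question:
    # one request per block of `limit` offsets, up to and including `size`.
    offset = 0
    while offset <= size:
        yield {"offset": offset, "limit": limit}
        offset += limit


async def fake_fetch(params, delay):
    # Hypothetical stand-in for fetch(): sleeps, then echoes the offset back.
    await asyncio.sleep(delay)
    return {"data": {"field": params["offset"]}}


async def get_all_pages(total):
    requests = list(paginate(total))
    # Earlier offsets get longer delays, so they finish last.
    coroutines = [
        fake_fetch(params, delay=0.05 * (len(requests) - i))
        for i, params in enumerate(requests)
    ]
    for routine in asyncio.as_completed(coroutines):
        r = await routine
        yield r["data"]["field"]


async def collect(total):
    # Drain the async generator into a list.
    return [x async for x in get_all_pages(total)]


def count_requests(total):
    return len(list(paginate(total)))


if __name__ == "__main__":
    for total in (50, 169, 250):
        print(total, count_requests(total), asyncio.run(collect(total)))
```

If the total reported by the real API is below 100, paginate yields a single set of parameters, so the generator yields a single result. That would match the behaviour described in the question, although without access to the real API this is only a guess.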

