Making sense of asyncio.as_completed()
Imagine a simple program like the one below, with which I fetch a data field from a multi-page JSON dataset available through an API gateway. (Sorry, I couldn't find a free JSON API that supports pagination, so the example isn't fully reproducible.)
import asyncio
import aiohttp

async def fetch(url, params=None):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params) as response:
            return await response.json()

async def get_all_pages(base_url):
    def paginate(size=10**6):
        limit = 100
        offset = 0
        while offset <= size:
            yield {"offset": offset, "limit": limit}
            offset += limit

    total = (await fetch(base_url))["data"]["total"]  # total number of pages
    coroutines = [fetch(base_url, params) for params in paginate(total)]
    print("total number of pages: {}, total number of coroutines: {}".format(
        total, len(coroutines)))
    for routine in asyncio.as_completed(coroutines):
        r = await routine
        yield r["data"]["field"]  # a field in the data for each page

async def main():
    url = "http://arandomurl.com"
    results = []
    async for x in get_all_pages(url):
        results.append(x)
    print(len(results))  # prints 1 -> only the first element is returned

asyncio.run(main())
The problem is that the for loop in my main function only retrieves the first element of my generator; somehow the generator stops after yielding the first element. This means as_completed is not working the way I thought it would inside get_all_pages: yielding the result of each completed coroutine, which is then handed to the yield r["data"]["field"] line. How can I do this correctly?
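To pin down what I expect, here is a minimal, stdlib-only sketch of how I understand asyncio.as_completed to behave (work is a hypothetical stand-in for fetch):

```python
import asyncio

async def work(i):
    # later-submitted coroutines finish first, so completion order
    # is the reverse of submission order
    await asyncio.sleep((4 - i) * 0.02)
    return i

async def main():
    coroutines = [work(i) for i in range(5)]
    results = []
    # as_completed should yield an awaitable for each coroutine,
    # in the order the coroutines finish
    for routine in asyncio.as_completed(coroutines):
        results.append(await routine)
    return results

out = asyncio.run(main())
print(out)  # [4, 3, 2, 1, 0] -- all five results, in completion order
```

If as_completed works like this, the loop in get_all_pages should run once per page, not once in total.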
Here is a test program I wrote. I took the code posted in the question and replaced the body of the function "fetch" so that it returns dictionaries. With that change I can actually run the program, and it works: I get one item in "results" for every 100 "pages".
import asyncio

async def fetch(_url, params=None):
    if params is None:
        return {"data": {"total": 169}}
    return {"data": {"field": str(params)}}

async def get_all_pages(base_url):
    def paginate(size=10**6):
        limit = 100
        offset = 0
        while offset <= size:
            yield {"offset": offset, "limit": limit}
            offset += limit

    total = (await fetch(base_url))["data"]["total"]  # total number of pages
    coroutines = [fetch(base_url, params) for params in paginate(total)]
    print("total number of pages: {}, total number of coroutines: {}".format(
        total, len(coroutines)))
    for routine in asyncio.as_completed(coroutines):
        r = await routine
        yield r["data"]["field"]  # a field in the data for each page

async def main():
    url = "http://arandomurl.com"
    results = []
    async for x in get_all_pages(url):
        results.append(x)
    print(results)  # prints all items, one per 100 "pages"

asyncio.run(main())
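One side note: asyncio.as_completed yields results in completion order, not submission order, so the pages can arrive shuffled. If page order matters, asyncio.gather returns results in the order the coroutines were passed in. A minimal sketch (page is a hypothetical stand-in for fetch):

```python
import asyncio

async def page(n):
    # make later pages finish first, to show that gather
    # still preserves input order
    await asyncio.sleep((2 - n) * 0.02)
    return {"data": {"field": n}}

async def main():
    # unlike as_completed, gather returns a list ordered like its inputs
    results = await asyncio.gather(*[page(n) for n in range(3)])
    return [r["data"]["field"] for r in results]

fields = asyncio.run(main())
print(fields)  # [0, 1, 2] even though page 2 finished first
```

The trade-off is that gather hands back everything at once, whereas as_completed lets you process each page as soon as it arrives.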