asyncio.gather(*coroutines) is not using the right payload

I am new to asyncio in Python and I stumbled upon a strange behaviour of asyncio.gather(*coroutines).

In my project I want to request an API and download one report per day. I want to make the requests in parallel with asyncio and then combine the results. I also do some request polling, but anyway: I tried to break my code down to the simplest version that shows the issue (including test output; using Python 3.8).


import asyncio

class ReportingService:

    async def _poll_and_get_report(self, payload:dict):

        print(2, payload)

        return [] # just a dummy for illustration

    def _get_report_for_dates(self, payload):

        dates = [
            '20200705',
            '20200706',
            '20200707',
            '20200708',
            '20200709'
        ]

        async def _(payload:dict, dates:list):
            coroutines = []
            for date in dates:
                payload['reportDate'] = date
                print(1, payload)
                coroutines.append(self._poll_and_get_report(payload))

            results =  await asyncio.gather(*coroutines)
            return results

        response_json = []
        results = asyncio.run( _(payload, dates) )
        for json_data in results:
            response_json.extend(json_data)

        return response_json


service = ReportingService()
service._get_report_for_dates({'foo': 'bar'})

This is the output I expect:

1 {'foo': 'bar', 'reportDate': '20200705'}
1 {'foo': 'bar', 'reportDate': '20200706'}
1 {'foo': 'bar', 'reportDate': '20200707'}
1 {'foo': 'bar', 'reportDate': '20200708'}
1 {'foo': 'bar', 'reportDate': '20200709'}
2 {'foo': 'bar', 'reportDate': '20200705'}
2 {'foo': 'bar', 'reportDate': '20200706'}
2 {'foo': 'bar', 'reportDate': '20200707'}
2 {'foo': 'bar', 'reportDate': '20200708'}
2 {'foo': 'bar', 'reportDate': '20200709'}

But this is the output I actually get:

1 {'foo': 'bar', 'reportDate': '20200705'}
1 {'foo': 'bar', 'reportDate': '20200706'}
1 {'foo': 'bar', 'reportDate': '20200707'}
1 {'foo': 'bar', 'reportDate': '20200708'}
1 {'foo': 'bar', 'reportDate': '20200709'}
2 {'foo': 'bar', 'reportDate': '20200709'}  <-- WTF???
2 {'foo': 'bar', 'reportDate': '20200709'}
2 {'foo': 'bar', 'reportDate': '20200709'}
2 {'foo': 'bar', 'reportDate': '20200709'}
2 {'foo': 'bar', 'reportDate': '20200709'}

This is super confusing and annoying to me, and it is currently completely blocking the progress of a project. I am sure I am misunderstanding something fundamental about using coroutines. Maybe payload data is not intended to be used with coroutines at all, but what would be a working alternative?

Your problem is that you're passing the same dictionary object into each call to _poll_and_get_report. This doesn't actually have anything to do with asyncio; it just seems like it does because the execution of _poll_and_get_report is deferred until you await gather. By then the loop has finished, so the one shared dictionary holds the last date, and every coroutine prints it.
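To make the aliasing visible, here is a minimal sketch that reproduces the effect twice: once with a plain list and no asyncio at all, and once with coroutines. The names `show`, `main`, `aliased`, and `shared_result` are made up for this illustration and are not part of the question's code.

```python
import asyncio

dates = ["20200705", "20200706"]

# 1) No asyncio: the list stores two references to the same dict.
payload = {"foo": "bar"}
stored = []
for date in dates:
    payload["reportDate"] = date  # mutates the one shared dict
    stored.append(payload)        # appends a reference, not a copy
aliased = [p["reportDate"] for p in stored]
print(aliased)  # ['20200706', '20200706']


# 2) With asyncio: calling show(...) only creates a coroutine object;
#    its body runs later, once gather schedules it.
async def show(payload: dict) -> str:
    return payload["reportDate"]


async def main() -> list:
    payload = {"foo": "bar"}
    coroutines = []
    for date in dates:
        payload["reportDate"] = date
        coroutines.append(show(payload))  # body has not run yet
    return await asyncio.gather(*coroutines)


shared_result = asyncio.run(main())
print(shared_result)  # ['20200706', '20200706']
```

Both halves print the last date twice, which shows that the shared dictionary is the cause and the deferred execution only makes it noticeable.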

You need to either split the date out into a separate argument or pass a new dictionary in each call. The simplest way to do this is:

async def _(payload:dict, dates:list):
    coroutines = []
    for date in dates:
        payload['reportDate'] = date
        print(1, payload)
        coroutines.append(self._poll_and_get_report({**payload}))

    results =  await asyncio.gather(*coroutines)
    return results

If you need payload to be preserved for whatever is calling _get_report_for_dates, you'll only want to set reportDate in the new dictionary:

async def _(payload:dict, dates:list):
    coroutines = []
    for date in dates:
        new_payload = {**payload, "reportDate": date}
        print(1, new_payload)
        coroutines.append(self._poll_and_get_report(new_payload))

    results =  await asyncio.gather(*coroutines)
    return results
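The other option mentioned above, splitting the date out into a separate argument, could look roughly like the following sketch. It is written as free functions so it runs on its own; `poll_and_get_report`, `get_reports`, the `report_date` parameter, and the `asyncio.sleep(0)` stand-in for the real polling are assumptions for illustration, not the asker's actual API.

```python
import asyncio


async def poll_and_get_report(payload: dict, report_date: str) -> list:
    # Build the request body locally, so nothing is shared between calls.
    request = {**payload, "reportDate": report_date}
    await asyncio.sleep(0)  # placeholder for the real polling / HTTP work
    return [request]        # dummy result: echo the request that was built


async def get_reports(payload: dict, dates: list) -> list:
    coroutines = [poll_and_get_report(payload, d) for d in dates]
    # gather returns results in the same order as the awaitables passed in.
    results = await asyncio.gather(*coroutines)
    # Flatten the per-day lists into one list.
    return [row for rows in results for row in rows]


base = {"foo": "bar"}
reports = asyncio.run(get_reports(base, ["20200705", "20200706", "20200707"]))
print([r["reportDate"] for r in reports])  # ['20200705', '20200706', '20200707']
print(base)                                # {'foo': 'bar'}
```

Because the date travels as its own argument, the caller's dictionary is never mutated and each coroutine sees its own date.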
