
Using nested asyncio.gather() inside another asyncio.gather()

I have a class with various methods. I have a method in that class that looks something like this:

 import asyncio

 class MyClass:

    async def master_method(self):
      tasks = [self.sub_method() for _ in range(10)]
      results = await asyncio.gather(*tasks)

    async def sub_method(self):
      subtasks = [self.my_task() for _ in range(10)]
      results = await asyncio.gather(*subtasks)

    async def my_task(self):
      return "task done"

So the questions here are:

  1. Are there any issues, or advantages/disadvantages, with using asyncio.gather() inside coroutines that are themselves being called from another asyncio.gather()? Any performance issues?

  2. Are all tasks at all levels treated with the same priority by the asyncio loop? Would this give the same performance as if I had called all the coroutines with a single asyncio.gather() from master_method?

TLDR: Using gather instead of returning tasks simplifies usage and makes code easier to maintain. While gather has some overhead, it is negligible for any practical application.


Why gather?

The point of using gather to accumulate child tasks before exiting a coroutine is to delay the completion of the coroutine until its child tasks are done. This encapsulates the implementation and ensures that the coroutine appears as one single entity "doing its thing".
The alternative is to return the child tasks and expect the caller to run them to completion.

For simplicity, let's look at a single layer – corresponding to the intermediate sub_method – but in different variations.

import asyncio
from typing import Awaitable, List

async def child(i):
    await asyncio.sleep(0.2)  # some non-trivial payload
    print("child", i, "done")

async def encapsulated() -> None:
    await asyncio.sleep(0.1)  # some preparation work
    children = [child(i) for i in range(10)]
    await asyncio.gather(*children)

async def task_children() -> List[asyncio.Task]:
    await asyncio.sleep(0.1)  # some preparation work
    children = [asyncio.create_task(child(i)) for i in range(10)]
    return children

async def coro_children() -> List[Awaitable[None]]:
    await asyncio.sleep(0.1)  # some preparation work
    children = [child(i) for i in range(10)]
    return children

All of encapsulated, task_children and coro_children in some way encode that there are sub-tasks. This allows the caller to run them in such a way that the actual goal is reliably "done". However, each variant differs in how much it does by itself and how much the caller has to do (a combined, runnable sketch follows the list):

  • The encapsulated is the "heaviest" variant: all children are run in Tasks and there is an additional gather. However, the caller is not exposed to any of this:
     await encapsulated()
    This guarantees that the functionality works as intended, and its implementation can freely be changed.
  • The task_children is the intermediate variant: all children are run in Tasks. The caller can decide if and how to wait for completion:
     tasks = await task_children()
     await asyncio.gather(*tasks)  # can add other tasks here as well
    This guarantees that the functionality starts as intended. Its completion relies on the caller having some knowledge, though.
  • The coro_children is the "lightest" variant: nothing of the children is actually run. The caller is responsible for the entire lifetime:
     tasks = await coro_children()   # children don't actually run yet!
     await asyncio.gather(*tasks)    # can add other tasks here as well
    This completely relies on the caller to start and wait for the sub-tasks.
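
For completeness, here is one way to exercise all three variants end to end – a sketch only, assuming the child/encapsulated/task_children/coro_children definitions from the code block above:

import asyncio

async def main() -> None:
    # Fully encapsulated: one await, and all children are guaranteed to be done.
    await encapsulated()

    # task_children: the children are already running as Tasks,
    # but we still have to wait for them ourselves.
    tasks = await task_children()
    await asyncio.gather(*tasks)

    # coro_children: bare coroutines, nothing runs until we gather
    # (or create tasks for) them ourselves.
    coros = await coro_children()
    await asyncio.gather(*coros)

asyncio.run(main())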

Using the encapsulated pattern is a safe default – it ensures that the coroutine "just works". Notably, a coroutine using an internal gather still appears like any other coroutine.

Speedy gather?

The gather utility a) ensures that its arguments are run as Tasks and b) provides a Future that triggers once the tasks are done. Since gather is usually used in situations where one would run the arguments as Tasks anyway, there is no additional overhead from this; likewise, these are regular Tasks and have the same performance/priority characteristics¹ as everything else.

The only overhead is from the wrapping Future; this takes care of bookkeeping (ensuring the arguments are tasks) and then only waits, i.e. does nothing. On my machine, measuring the overhead shows that it takes on average about twice as long as running a no-op Task. This by itself should already be negligible for any real-world task.
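
One rough way to reproduce such a measurement is to compare awaiting a bare no-op Task with awaiting the same no-op wrapped in gather – a sketch only; the absolute numbers will vary with machine and Python version:

import asyncio
import time

async def noop() -> None:
    pass

async def bench(n: int = 100_000) -> None:
    # Baseline: await a bare no-op Task n times.
    start = time.perf_counter()
    for _ in range(n):
        await asyncio.create_task(noop())
    plain = time.perf_counter() - start

    # Same no-op, but wrapped in gather (which creates the Task internally).
    start = time.perf_counter()
    for _ in range(n):
        await asyncio.gather(noop())
    gathered = time.perf_counter() - start

    print(f"plain task: {plain:.3f}s  gather: {gathered:.3f}s")

asyncio.run(bench())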

In addition, the pattern of gathering child tasks inherently means that there is a tree of gather nodes. Thus the number of gather nodes is usually much lower than the number of tasks. For example, with 10 tasks per gather, a total of only 11 gathers is needed to handle a total of 100 tasks:

master_method                                                  0

sub_method         0          1          2          3          4          5 ...

my_task       0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 ...
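
More generally, with a fan-out of f children per gather and d levels of nesting, the nested pattern needs 1 + f + … + f^(d−1) = (f^d − 1)/(f − 1) gather nodes for f^d leaf tasks; for f = 10 and d = 2 that is (100 − 1)/9 = 11 gathers for 100 my_task coroutines.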

¹Which is to say, none. asyncio currently has no concept of Task priorities.
