
Python asyncio Await Tasks

Requires: Python 3.7 or later.

Two functions, main1 and main2, are defined below. One creates all the tasks and then awaits them at the end; the other creates and awaits each task one at a time.

main1 takes 2 seconds, while main2 takes 30 seconds. Why?

import asyncio

async def say_after(delay, what):
    await asyncio.sleep(delay)
    print(what)

async def main1():
    tasks = []
    for _ in range(10):
        task1 = asyncio.create_task(say_after(1, 'hello'))
        task2 = asyncio.create_task(say_after(2, 'world'))
        tasks.append(task1)
        tasks.append(task2)
    for x in tasks:
        await x

async def main2():
    for _ in range(10):
        await asyncio.create_task(say_after(1, 'hello'))
        await asyncio.create_task(say_after(2, 'world'))

asyncio.run(main2())

EDIT 1:

Here is a main3 version, which takes 20 seconds. I'd say the whole thing is just counterintuitive :(

async def main3():
    for _ in range(10):
        task1 = asyncio.create_task(say_after(1, 'hello'))
        task2 = asyncio.create_task(say_after(2, 'world'))
        await task1
        await task2

EDIT 2:

(Some more sample code is added below.) I've read the detailed answer from @freakish, but I'm still stuck on one point: so only consecutive awaits will cooperatively work in parallel (main4)?

Since create_task() takes no time (right?), why don't both awaits in main5 run in the background, so that main5 would take the maximum time of (task1, task2)?

Is this await mechanism by design, or is it just an asyncio limitation (in design or in implementation)?

And is the detailed behavior of await defined anywhere in the official Python docs?

# took 2 seconds
async def main4():
    task1 = asyncio.create_task(say_after(1, 'hello'))
    task2 = asyncio.create_task(say_after(2, 'world'))
    await task1
    await task2

# took 3 seconds
async def main5():
    task1 = asyncio.create_task(say_after(1, 'hello'))
    await task1
    task2 = asyncio.create_task(say_after(2, 'world'))
    await task2

Because main1 creates all the tasks up front and only awaits them after they have all been created. Everything runs in parallel (concurrently on the event loop), so the total time is the maximum of the individual times, which is 2s.

main2, on the other hand, creates a new task only after the previous one has finished. Everything happens sequentially, so the total time is the sum of the individual times, which (judging from the code) should be 30s.
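The difference between the two patterns can be measured directly. The following is a minimal sketch, not the asker's exact code: the helper names create_all_then_await, create_and_await_each and timed are made up for illustration, and a scale parameter is assumed so the run finishes in well under a second instead of 30 seconds.

```python
import asyncio
import time


async def say_after(delay, what):
    await asyncio.sleep(delay)
    return what


async def create_all_then_await(n, scale):
    # Same shape as main1: every task is scheduled before the first await.
    tasks = []
    for _ in range(n):
        tasks.append(asyncio.create_task(say_after(1 * scale, 'hello')))
        tasks.append(asyncio.create_task(say_after(2 * scale, 'world')))
    for task in tasks:
        await task


async def create_and_await_each(n, scale):
    # Same shape as main2: the next task is only created once the
    # previous one has finished.
    for _ in range(n):
        await asyncio.create_task(say_after(1 * scale, 'hello'))
        await asyncio.create_task(say_after(2 * scale, 'world'))


def timed(coro_func, *args):
    # Hypothetical helper: run a coroutine function and return elapsed seconds.
    start = time.perf_counter()
    asyncio.run(coro_func(*args))
    return time.perf_counter() - start


if __name__ == "__main__":
    n, scale = 3, 0.05
    print("create all, then await:", round(timed(create_all_then_await, n, scale), 2))
    print("create and await each: ", round(timed(create_and_await_each, n, scale), 2))
```

With n=3 and scale=0.05, the first figure should land near 2 * scale and the second near n * 3 * scale, which mirrors the 2s versus 30s in the question.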

Edit: say you have 3 tasks: task1, task2, task3. If you do

  1. create task1
  2. await task1
  3. create task2
  4. await task2
  5. create task3
  6. await task3

then the total execution time is obviously task1.time + task2.time + task3.time, because there is no background processing. The flow is sequential. Now let's say you do

  1. create task1
  2. create task2
  3. create task3
  4. await task1
  5. await task2
  6. await task3

Now task1, task2 and task3 run in the background. So step 4 takes T1 = task1.time. But step 5 takes only T2 = max(task2.time - T1, 0), because task2 has already been running in the background for T1. Step 6 takes T3 = max(task3.time - T2 - T1, 0), because task3 has already been running in the background for T1+T2. Some maths is then required to show that T1+T2+T3 = max(task1.time, task2.time, task3.time).

But the intuition is this: if taskX is the longest task, then by the time it has finished everything else has finished too, thanks to parallel processing. So the awaits after it return immediately, making the total processing time the maximum of the individual times.
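The per-await waits T1, T2, T3 described above can be checked empirically. This is a small sketch under assumed delays of 0.2, 0.5 and 0.3 seconds; work and measure_waits are illustrative names, not part of asyncio.

```python
import asyncio
import time


async def work(delay):
    await asyncio.sleep(delay)


async def measure_waits(delays):
    """Create all tasks first, then time each individual await."""
    tasks = [asyncio.create_task(work(d)) for d in delays]
    waits = []
    for task in tasks:
        start = time.perf_counter()
        await task  # returns at once if the task has already finished
        waits.append(time.perf_counter() - start)
    return waits


if __name__ == "__main__":
    delays = [0.2, 0.5, 0.3]
    waits = asyncio.run(measure_waits(delays))
    for d, w in zip(delays, waits):
        print(f"task with delay {d}: awaited for about {w:.2f}s")
    print(f"sum of waits: about {sum(waits):.2f}s, max delay: {max(delays)}s")
```

With these delays the third await should take almost no time, because that task finished while the second one was still being awaited, and the waits should add up to roughly the longest delay.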

Side note: there are nuances. This only works when the tasks actually do something parallelizable, like asyncio.sleep(). If the tasks are synchronous (say, some CPU calculations), then both cases will take 30s.
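To illustrate the side note, here is a hedged sketch in which time.sleep() stands in for synchronous CPU work (an assumption made for the demo; blocking_work, create_all_then_await, create_and_await_each and timed are hypothetical names). Because the blocking call never yields to the event loop, creating all tasks up front no longer helps.

```python
import asyncio
import time


async def blocking_work(delay):
    # time.sleep() blocks the whole event loop; no other task can run meanwhile.
    time.sleep(delay)


async def create_all_then_await(delays):
    tasks = [asyncio.create_task(blocking_work(d)) for d in delays]
    for task in tasks:
        await task


async def create_and_await_each(delays):
    for d in delays:
        await asyncio.create_task(blocking_work(d))


def timed(coro):
    # Hypothetical helper: run a coroutine and return elapsed seconds.
    start = time.perf_counter()
    asyncio.run(coro)
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.05, 0.1, 0.05]
    print("create all, then await:", round(timed(create_all_then_await(delays)), 2))
    print("create and await each: ", round(timed(create_and_await_each(delays)), 2))
```

Both orderings should take about the sum of the delays, since the tasks can only run one after another.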

Edit2: Your main3 has a slightly different flow. It lets two tasks run in parallel, but no more:

  1. create task1
  2. create task2
  3. await task1
  4. await task2
  5. create task3
  6. create task4
  7. await task3
  8. await task4

So this time task1 and task2 run in parallel. But only after both are done can task3 and task4 run, again in parallel. So within each group the total time is the maximum, but you have to sum over the separate groups. I.e. the total execution time is max(task1.time, task2.time) + max(task3.time, task4.time), and so on for the later pairs, which in your case is

max(1,2) + ... + max(1,2) [10 times] = 20
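The same "maximum within a group, sum across groups" rule can be written down as a tiny calculation. This is only an arithmetic sketch of the reasoning above, assuming ideal asyncio.sleep() delays with no overhead; predicted_total and the *_groups names are made up for illustration.

```python
def predicted_total(groups):
    """Sum, over all groups, of the longest delay inside each group.

    A "group" is a set of tasks that are all created before any of them
    is awaited, so they overlap with each other but not with other groups.
    """
    return sum(max(group) for group in groups)


main1_groups = [[1, 2] * 10]    # one group containing all 20 tasks
main2_groups = [[1], [2]] * 10  # 20 groups of a single task each
main3_groups = [[1, 2]] * 10    # 10 groups of two tasks each
main4_groups = [[1, 2]]         # one group of two tasks
main5_groups = [[1], [2]]       # two groups of one task each

print(predicted_total(main1_groups))  # 2
print(predicted_total(main2_groups))  # 30
print(predicted_total(main3_groups))  # 20
print(predicted_total(main4_groups))  # 2
print(predicted_total(main5_groups))  # 3
```

These match the timings reported in the question for main1 through main5.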
