
Python asyncio and threading are not working in parallel

I have the following code, written for testing purposes.

It is just a function that computes factorial(900) 5000 times and prints the output.

It doesn't matter whether I use threading or asyncio: the functions always run one after the other, never in parallel.

First, with asyncio:

import asyncio
async def factorial(name, number):
    def fatorial(n):
        if n == 1:
            return 1
        else:
            return n * fatorial(n - 1)
    print(f"START: Task {name}: factorial({number})")
    for i in range(5000):
        var = fatorial(number)
    print(f"FIM: Task {name}: factorial({number})")
    return var

async def main():
    task1 = asyncio.ensure_future(factorial("A", 900))
    task2 = asyncio.ensure_future(factorial("B", 900))
    task3 = asyncio.ensure_future(factorial("C", 900))
    task4 = asyncio.ensure_future(factorial("D", 900))
    await asyncio.gather(task1, task2, task3, task4)


asyncio.run(main())

I also tried:

async def main():
    # Schedule three calls *concurrently*:
    task1 = asyncio.create_task(factorial("A", 900))
    task2 = asyncio.create_task(factorial("B", 900))
    task3 = asyncio.create_task(factorial("C", 900))
    task4 = asyncio.create_task(factorial("D", 900))
    await task4


asyncio.run(main())

And I also tried with threading:

import threading
def factorial(name, number):
    def fatorial(n):
        if n == 1:
            return 1
        else:
            return n * fatorial(n - 1)
    print(f"START: Task {name}: factorial({number})")
    for i in range(5000):
        var = fatorial(number)
    print(f"FIM: Task {name}: factorial({number})")


threading.Thread(target=factorial("A", 900), daemon=True).start()
threading.Thread(target=factorial("B", 900), daemon=True).start()
threading.Thread(target=factorial("C", 900), daemon=True).start()
threading.Thread(target=factorial("D", 900), daemon=True).start()

The output is always the same:

START: Task A: factorial(900)
FIM: Task A: factorial(900)
START: Task B: factorial(900)
FIM: Task B: factorial(900)
START: Task C: factorial(900)
FIM: Task C: factorial(900)
START: Task D: factorial(900)
FIM: Task D: factorial(900)

As said in the comments, none of these are good for CPU-bound work. But yes, they can run somewhat in parallel - it will just take the same amount of time as running them in sequence, or a little more. Your code is simply incorrect in both cases.

For the asyncio bit: asyncio only stops one task to run another when it meets an await expression or equivalent. With no awaits, your code just runs straight through to completion, with no chance for task switching.

awaits occur somewhat naturally in code that is well suited to running as async, as they are placed before calls to I/O that take some time (you await queries to the db server, or an http request, for example). In a CPU-bound loop like this, there is nothing to await, so if you want to be nice and let other code run in parallel, you have to introduce those "holes" yourself. One way of doing it is awaiting a call to asyncio.sleep. If you put one such call inside your inner factorial, you should see the tasks running in parallel:

import asyncio
async def factorial(name, number):
    async def fatorial(n):
        # Yield to the event loop so the other tasks get a chance to run.
        await asyncio.sleep(0)
        if n == 1:
            return 1
        else:
            # fatorial is now a coroutine function, so the recursive call must be awaited.
            return n * await fatorial(n - 1)
    print(f"START: Task {name}: factorial({number})")
    for i in range(5000):
        var = await fatorial(number)
    print(f"FIM: Task {name}: factorial({number})")
    return var
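
To make the task switching visible without the factorial noise, here is a minimal sketch. The names worker, run and the log list are made up for illustration and are not part of the question's code; the only point is that a task hands control to the event loop only where it awaits.

```python
import asyncio

async def worker(name, steps, cooperative, log):
    # Hypothetical helper: records each step so the ordering can be inspected.
    for i in range(steps):
        log.append(f"{name}{i}")
        if cooperative:
            # The only point where this task lets the event loop run another task.
            await asyncio.sleep(0)

async def run(cooperative):
    log = []
    await asyncio.gather(
        worker("A", 3, cooperative, log),
        worker("B", 3, cooperative, log),
    )
    return log

if __name__ == "__main__":
    print(asyncio.run(run(cooperative=False)))  # A finishes before B starts
    print(asyncio.run(run(cooperative=True)))   # steps of A and B alternate
```

Without the await, worker A runs all its steps before B gets a turn; with it, the steps interleave.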



In the threaded case, there is a different mistake. Unlike with async def declared functions, when you create a thread with a function as its target, you must not call the function. You pass the function itself, as an object, to the Thread constructor, and its arguments separately. The actual function call takes place inside the thread. The way you did it, you called all the functions eagerly, before each thread was even created (everything inside the parentheses of the call to threading.Thread(...) has to be evaluated before the call itself takes place).

The behavior of "async def functions" (the proper name is "coroutine function") is different: the call syntax (e.g. factorial()) is evaluated, but this does not run any code in the function body. Rather, calling a coroutine function yields a coroutine object, which can then be awaited or wrapped in a task or future: only then is the code in the function body actually executed.

So, for the threaded code, the changes are these:
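
The difference between the two kinds of call can be checked directly. The sketch below uses made-up names (plain, coro, demo) purely for illustration: calling the regular function runs its body at once, while calling the coroutine function only builds a coroutine object whose body runs later, when it is driven by the event loop.

```python
import asyncio

def plain(name, calls):
    calls.append(f"plain {name}")   # runs as soon as plain(...) is called
    return name

async def coro(name, calls):
    calls.append(f"coro {name}")    # runs only once the coroutine is awaited or run
    return name

def demo():
    calls = []
    plain("A", calls)               # body executes immediately
    pending = coro("B", calls)      # creates a coroutine object, body not run yet
    before = list(calls)            # snapshot taken before the coroutine is driven
    is_coroutine = asyncio.iscoroutine(pending)
    result = asyncio.run(pending)   # now the body executes
    return before, is_coroutine, result, calls

if __name__ == "__main__":
    print(demo())
```

This is also why target=factorial("A", 900) misbehaves with threads: with a regular function, the call happens on the spot, in the main thread, and Thread only receives its return value.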


threading.Thread(target=factorial, args=("A", 900), daemon=True).start()
threading.Thread(target=factorial, args=("B", 900), daemon=True).start()
threading.Thread(target=factorial, args=("C", 900), daemon=True).start()
threading.Thread(target=factorial, args=("D", 900), daemon=True).start()
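
One thing to watch with the lines above: the threads are daemon threads and nothing waits for them, so if this is the whole script, the interpreter can exit before they finish. A minimal sketch that keeps the Thread objects and joins them follows. The run_threads helper, the results dict and the repeat parameter are assumptions added for illustration, not part of the original code.

```python
import threading

def fatorial(n):
    return 1 if n <= 1 else n * fatorial(n - 1)

def factorial(name, number, results, repeat=50):
    print(f"START: Task {name}: factorial({number})")
    for _ in range(repeat):
        var = fatorial(number)
    print(f"FIM: Task {name}: factorial({number})")
    results[name] = var  # hand the value back to the caller

def run_threads(names=("A", "B", "C", "D"), number=900, repeat=50):
    results = {}
    threads = [
        # Pass the function object and its arguments separately; do not call it here.
        threading.Thread(target=factorial, args=(name, number, results, repeat), daemon=True)
        for name in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread before returning
    return results
```

Calling run_threads() should print the START lines of several tasks before the first FIM line, although the exact ordering depends on how the interpreter schedules the threads.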

Now, if you have a multicore machine, you can change your threading example into a multiprocessing example, taking the same care when creating the Process instances, and you should see your execution time fall roughly in proportion to the number of physical CPU cores you have (up to 4, as you are creating only 4 parallel tasks).

If you simplify the inner function, it is easier to see what is happening. Like this:
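
A rough sketch of that multiprocessing variant, under a few assumptions: the run_processes helper and the repeat parameter are made up for illustration, fatorial is moved to module level so child processes can import it, and the loop count is lowered from the question's 5000 to keep the demo quick. The `if __name__ == "__main__":` guard matters on platforms that start child processes by re-importing the main module.

```python
import multiprocessing

def fatorial(n):
    return 1 if n <= 1 else n * fatorial(n - 1)

def factorial(name, number, repeat=500):
    print(f"START: Task {name}: factorial({number})")
    for _ in range(repeat):
        var = fatorial(number)
    print(f"FIM: Task {name}: factorial({number})")
    return var

def run_processes(names=("A", "B", "C", "D"), number=900, repeat=500):
    procs = [
        # Same rule as with threads: pass the function and its arguments separately.
        multiprocessing.Process(target=factorial, args=(name, number, repeat))
        for name in names
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]  # 0 means the process finished normally

if __name__ == "__main__":
    print(run_processes())
```

Return values do not travel back from a Process automatically; a Queue or a process pool would be needed for that, which is left out here.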

import asyncio
import time

async def factorial(name, number):
    print(f"START: Task {name}: factorial({number})")
    if True:  # or False - change manually
        await asyncio.sleep(1.0)
    else:
        time.sleep(1.0)
    print(f"FIM: Task {name}: factorial({number})")

async def main():
    task1 = asyncio.ensure_future(factorial("A", 900))
    task2 = asyncio.ensure_future(factorial("B", 900))
    task3 = asyncio.ensure_future(factorial("C", 900))
    task4 = asyncio.ensure_future(factorial("D", 900))
    await asyncio.gather(task1, task2, task3, task4)

asyncio.run(main())

When True (and not CPU bound):

START: Task A: factorial(900)
START: Task B: factorial(900)
START: Task C: factorial(900)
START: Task D: factorial(900)
FIM: Task A: factorial(900)
FIM: Task C: factorial(900)
FIM: Task B: factorial(900)
FIM: Task D: factorial(900)

When False (also not CPU bound):

START: Task A: factorial(900)
FIM: Task A: factorial(900)
START: Task B: factorial(900)
FIM: Task B: factorial(900)
START: Task C: factorial(900)
FIM: Task C: factorial(900)
START: Task D: factorial(900)
FIM: Task D: factorial(900)

The difference is that awaiting asyncio.sleep hands control back to the event loop. Calling time.sleep does not.
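
The same effect shows up in wall-clock time. The sketch below uses hypothetical helpers (nap, timed) and a shorter delay than the example above: with asyncio.sleep the four waits overlap, while with time.sleep each task blocks the event loop and the waits add up.

```python
import asyncio
import time

async def nap(blocking, delay):
    if blocking:
        time.sleep(delay)           # blocks the whole event loop
    else:
        await asyncio.sleep(delay)  # suspends only this task

async def timed(blocking, tasks=4, delay=0.1):
    start = time.perf_counter()
    await asyncio.gather(*(nap(blocking, delay) for _ in range(tasks)))
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"asyncio.sleep: {asyncio.run(timed(blocking=False)):.2f}s")
    print(f"time.sleep:    {asyncio.run(timed(blocking=True)):.2f}s")
```

On an otherwise idle machine, the first figure should be close to a single delay and the second close to four delays.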
