

How to measure Python's asyncio code performance?

I can't use normal tools and techniques to measure the performance of a coroutine, because the time spent at await should not be taken into consideration (or it should only account for the overhead of reading from the awaitable, not the IO latency).

So how do I measure the time a coroutine takes? How do I compare two implementations and find the more efficient one? What tools do I use?

This answer originally contained two different solutions: the first one was based on monkey-patching and the second one does not work for Python 3.7 onward. This new version hopefully presents a better, more robust approach.

First off, standard timing tools such as the time command can be used to determine the CPU time of a program, which is usually what we're interested in when testing the performance of an asynchronous application. Those measurements can also be performed in Python using the time.process_time() function:

import time

real_time = time.time()          # wall-clock reference
cpu_time = time.process_time()   # CPU-time reference

time.sleep(1.)                   # IO-like wait: real time passes, almost no CPU is used
sum(range(10**6))                # CPU-bound work

real_time = time.time() - real_time
cpu_time = time.process_time() - cpu_time

print(f"CPU time: {cpu_time:.2f} s, Real time: {real_time:.2f} s")

See below the similar output produced by both methods:

$ /usr/bin/time -f "CPU time: %U s, Real time: %e s" python demo.py
CPU time: 0.02 s, Real time: 1.02 s  # python output
CPU time: 0.03 s, Real time: 1.04 s  # `time` output

In an asyncio application, it might happen that some synchronous part of the program ends up performing a blocking call, effectively preventing the event loop from running other tasks. So we might want to record the time the event loop spends waiting separately from the time taken by other blocking IO calls.

This can be achieved by subclassing the default selector to perform some timing operation and using a custom event loop policy to set everything up. The code below provides such a policy along with a context manager for printing the different time metrics.
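The definitions of that policy and context manager are not reproduced on this page. What follows is a minimal sketch of what they could look like, assuming the idea is to wrap the selector's select() call so that the time the loop spends waiting can be accumulated; only the names TimedEventLoopPolicy and print_timing come from the usage snippet further below, everything else (including the TimedSelector helper) is an assumption:

import asyncio
import selectors
import time
from contextlib import contextmanager


class TimedSelector(selectors.DefaultSelector):
    """Selector that accumulates the time spent waiting in select()."""
    select_time = 0.

    def reset_select_time(self):
        self.select_time = 0.

    def select(self, timeout=None):
        if timeout is not None and timeout <= 0:
            # Non-blocking poll: do not count it as waiting time
            return super().select(timeout)
        start = time.perf_counter()
        try:
            return super().select(timeout)
        finally:
            self.select_time += time.perf_counter() - start


class TimedEventLoopPolicy(asyncio.DefaultEventLoopPolicy):
    """Policy whose event loops use the timed selector above."""
    def new_event_loop(self):
        return asyncio.SelectorEventLoop(TimedSelector())


@contextmanager
def print_timing():
    """Print CPU / select / other IO / real time for the enclosed block."""
    selector = asyncio.get_event_loop()._selector  # private attribute of selector-based loops
    selector.reset_select_time()
    real_time = time.perf_counter()
    cpu_time = time.process_time()
    yield
    real_time = time.perf_counter() - real_time
    cpu_time = time.process_time() - cpu_time
    select_time = selector.select_time
    other_io_time = max(0., real_time - cpu_time - select_time)
    print(f"CPU time:      {cpu_time:.3f} s")
    print(f"Select time:   {select_time:.3f} s")
    print(f"Other IO time: {other_io_time:.3f} s")
    print(f"Real time:     {real_time:.3f} s")

Together with the usage snippet below, this should run as a standalone script; the accounting of "other IO time" as whatever real time is left over is a simplification.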

async def main():
    print("~ Correct IO management ~")
    with print_timing():
        await asyncio.sleep(1)    # IO wait handed to the event loop
        sum(range(10**6))         # CPU-bound work
    print()

    print("~ Incorrect IO management ~")
    with print_timing():
        time.sleep(0.2)           # blocking call: the event loop cannot run anything else
        await asyncio.sleep(0.8)
        sum(range(10**6))
    print()

asyncio.set_event_loop_policy(TimedEventLoopPolicy())
asyncio.run(main(), debug=True)

Note the difference between those two runs:

~ Correct IO management ~
CPU time:      0.016 s
Select time:   1.001 s
Other IO time: 0.000 s
Real time:     1.017 s

~ Incorrect IO management ~
CPU time:      0.016 s
Select time:   0.800 s
Other IO time: 0.200 s
Real time:     1.017 s

Also notice that the asyncio debug mode can detect those blocking operations:

Executing <Handle <TaskWakeupMethWrapper object at 0x7fd4835864f8>(<Future finis...events.py:396>) created at ~/miniconda/lib/python3.7/asyncio/futures.py:288> took 0.243 seconds
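This warning is emitted because the callback exceeded the loop's slow-callback threshold, which is 0.1 s by default. A minimal sketch of enabling debug mode and tightening that threshold via the standard slow_callback_duration attribute (the surrounding code is hypothetical):

import asyncio

async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.05   # report anything blocking the loop for more than 50 ms
    ...  # rest of the application

asyncio.run(main(), debug=True)          # or set the PYTHONASYNCIODEBUG=1 environment variable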

If you only want to measure the performance of "your" code, you could use an approach similar to unit testing - just monkey-patch (even patch + Mock) the nearest IO coroutine with a Future of the expected result.
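As an illustration (all names here are hypothetical, not taken from any particular library), a sketch of this approach using unittest.mock to swap the nearest IO coroutine for a stub that immediately returns the expected result, so that only your own processing code is timed:

import asyncio
import time
from unittest import mock

# Hypothetical IO coroutine (e.g. a database or HTTP call).
async def fetch_scores():
    await asyncio.sleep(1)            # real IO latency we want to exclude
    return list(range(10**5, 0, -1))

# The code whose performance we actually want to measure.
async def handler():
    scores = await fetch_scores()
    return sorted(scores)

async def benchmark(runs=100):
    # Stub standing in for the "expected result" of the IO call.
    async def fake_fetch_scores():
        return list(range(10**5, 0, -1))

    with mock.patch(f"{__name__}.fetch_scores", new=fake_fetch_scores):
        start = time.process_time()
        for _ in range(runs):
            await handler()
        cpu = time.process_time() - start
    print(f"CPU time per call: {cpu / runs * 1000:.2f} ms")

asyncio.run(benchmark())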

The main drawback is that, while e.g. an http client is fairly simple to patch, for something like momoko (a pg client)... it could be hard to do without knowing its internals, and it won't include the library overhead.

The pros are just like in ordinary testing:

  • it's easy to implement,
  • it measures something ;), mostly one's own implementation without the overhead of third-party libraries,
  • performance tests are isolated and easy to re-run,
  • it's easy to run with many payloads
