Profiling Python C extensions
I have developed a Python C extension that receives data from Python and performs some CPU-intensive calculations. Is it possible to profile the C extension?
The problem here is that writing a sample test in C to be profiled would be challenging, because the code relies on particular inputs and data structures (generated by the Python control code).
Do you have any suggestions?
After pygabriel's comment, I decided to upload a package to PyPI that implements a profiler for Python extensions using the cpu-profiler from google-perftools: http://pypi.python.org/pypi/yep
I've found my way using google-perftools. The trick was to wrap the functions StartProfiler and StopProfiler in Python (through Cython in my case).
To profile the C extension, it is sufficient to wrap the Python code between the StartProfiler and StopProfiler calls.
from google_perftools_wrapped import StartProfiler, StopProfiler
import c_extension # extension to profile c_extension.so
StartProfiler("output.prof")
# ... call the interesting functions from the C extension module here ...
StopProfiler()
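If the profiled code can raise, the start/stop pair can be wrapped in a context manager so that the stop call always runs. This is only a minimal sketch: `profiled` is a hypothetical helper name, not part of google-perftools or yep, and the start and stop functions are passed in as arguments so the sketch does not depend on the wrapper module from the snippet above.

```python
from contextlib import contextmanager


@contextmanager
def profiled(start, stop, output_path):
    """Run the enclosed block between start(output_path) and stop().

    `start` and `stop` are assumed to behave like the wrapped
    StartProfiler and StopProfiler functions from the answer.
    """
    start(output_path)
    try:
        yield output_path
    finally:
        # Stop the profiler even if the profiled code raises,
        # so profiling does not stay active by accident.
        stop()


# Hypothetical usage with the wrapper module from the answer:
# with profiled(StartProfiler, StopProfiler, "output.prof"):
#     c_extension.interesting_function()
```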
Then, to analyze the profile, you can for example export it in callgrind format and view the result in kcachegrind:
pprof --callgrind c_extension.so output.prof > output.callgrind
kcachegrind output.callgrind
One of my colleagues told me about ltrace(1). It helped me quite a lot in the same situation.
Assume the shared object name of your C extension is myext.so and you want to execute benchmark.py; then run:
ltrace -x @myext.so -c python benchmark.py
Its output looks like this:
% time seconds usecs/call calls function
------ ----------- ----------- --------- --------------------
24.88 30.202126 7550531 4 ldap_result
12.46 15.117625 7558812 2 l_ldap_result4
12.41 15.059652 5019884 3 ldap_chase_v3referrals
12.41 15.057678 3764419 4 ldap_new_connection
12.40 15.050310 3762577 4 ldap_int_open_connection
12.39 15.042360 3008472 5 ldap_send_server_request
12.38 15.029055 3757263 4 ldap_connect_to_host
0.05 0.057890 28945 2 ldap_get_option
0.04 0.052182 26091 2 ldap_sasl_bind
0.03 0.030760 30760 1 l_ldap_get_option
0.03 0.030635 30635 1 LDAP_get_option
0.02 0.029960 14980 2 ldap_initialize
0.02 0.027988 27988 1 ldap_int_initialize
0.02 0.026722 26722 1 l_ldap_simple_bind
0.02 0.026386 13193 2 ldap_send_initial_request
0.02 0.025810 12905 2 ldap_int_select
....
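For longer runs, the summary table can be loaded into Python for sorting or filtering. The following is a rough sketch that assumes the five-column layout shown above (% time, seconds, usecs/call, calls, function); `Row` and `parse_ltrace_summary` are hypothetical names, and header, separator and total lines are simply skipped.

```python
from typing import NamedTuple


class Row(NamedTuple):
    percent: float
    seconds: float
    usecs_per_call: int
    calls: int
    function: str


def parse_ltrace_summary(text):
    """Parse the data rows of an `ltrace -c` summary table."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # header, total line, or unrelated output
        try:
            rows.append(Row(float(fields[0]), float(fields[1]),
                            int(fields[2]), int(fields[3]), fields[4]))
        except ValueError:
            continue  # separator line made of dashes
    return rows


def slowest_per_call(rows, n=5):
    """Return the n rows with the highest time per call."""
    return sorted(rows, key=lambda r: r.usecs_per_call, reverse=True)[:n]
```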
Special care is needed if your shared object has - or + in its file name. These characters aren't treated as is (see man 1 ltrace for details).
The wildcard * can be a workaround, such as -x @myext* in place of -x @myext-2.so.
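When the benchmark is launched from a script, the workaround can be applied automatically. This sketch is a variation on the tip above, not something taken from the ltrace documentation: it replaces each run of - or + in the file name with *, on the assumption that the glob then still matches the real name. Both function names are hypothetical.

```python
import re


def ltrace_library_filter(soname):
    """Build an -x filter, replacing '-' and '+' with the '*' wildcard."""
    return "@" + re.sub(r"[-+]+", "*", soname)


def ltrace_command(soname, script):
    """Return the argument list for `ltrace -x @<lib> -c python <script>`."""
    return ["ltrace", "-x", ltrace_library_filter(soname),
            "-c", "python", script]


# Could be passed to subprocess.run(), for example:
# subprocess.run(ltrace_command("myext-2.so", "benchmark.py"))
```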
With gprof, you can profile any program that was properly compiled and linked (gcc -pg etc., in gprof's case). If you're using a Python version not built with gcc (e.g., the Windows precompiled version the PSF distributes), you'll need to research what equivalent tools exist for that platform and toolchain (in the Windows PSF case, maybe mingw can help). There may be "irrelevant" data there (internal C functions in the Python runtime), and, if so, the percentages shown by gprof may not be applicable, but the absolute numbers (of calls, and durations thereof) are still valid, and you can post-process gprof's output (e.g., with a little Python script ;-) to exclude the irrelevant data and compute the percentages you want.
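Such a post-processing script might look roughly like the sketch below. It assumes the input contains only gprof's flat-profile table, where the first three columns are % time, cumulative seconds and self seconds and the last column is the function name. The function names and the `myext_` prefix in the usage comment are made up for illustration.

```python
def parse_flat_profile(text):
    """Return {function_name: self_seconds} from gprof flat-profile rows."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            float(fields[0])                  # % time
            float(fields[1])                  # cumulative seconds
            self_seconds = float(fields[2])   # self seconds
        except ValueError:
            continue  # header or explanatory line
        name = fields[-1]
        totals[name] = totals.get(name, 0.0) + self_seconds
    return totals


def relative_percentages(totals, keep):
    """Recompute percentages over only the functions accepted by keep()."""
    kept = {name: s for name, s in totals.items() if keep(name)}
    total = sum(kept.values())
    if total == 0:
        return {name: 0.0 for name in kept}
    return {name: 100.0 * s / total for name, s in kept.items()}


# Example: keep only functions from a hypothetical extension whose
# symbols all start with "myext_".
# pct = relative_percentages(parse_flat_profile(report),
#                            lambda name: name.startswith("myext_"))
```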
I found py-spy very easy to use. See this blog post for an explanation of its native extension support.
Highlights include speedscope output (--format speedscope).
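As a rough illustration, a py-spy run can be assembled from Python like this. The --native flag (which asks py-spy to include native stack frames) and the --output option are not mentioned in the answer and are added here as assumptions about the py-spy command line; `py_spy_record_command` is a hypothetical helper name.

```python
def py_spy_record_command(script, output, fmt="speedscope", native=True):
    """Build the argument list for a `py-spy record` run of `script`."""
    cmd = ["py-spy", "record", "--format", fmt, "--output", output]
    if native:
        # Assumption: --native requests frames from C extensions too.
        cmd.append("--native")
    # Everything after "--" is the command that py-spy launches.
    return cmd + ["--", "python", script]


# Could be passed to subprocess.run(), for example:
# subprocess.run(py_spy_record_command("benchmark.py", "profile.json"))
```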