Are there advantages to using the Python/C interface instead of Cython?

I want to extend Python and NumPy by writing some modules in C or C++, using BLAS and LAPACK. I also want to be able to distribute the code as standalone C/C++ libraries. I would like these libraries to support both single- and double-precision floats. Some examples of functions I will write are conjugate gradient for solving linear systems, or accelerated first-order methods. Some functions will need to call a Python function from the C/C++ code.

After playing a little with the Python/C API and the NumPy/C API, I discovered that many people advocate the use of Cython instead (see for example this question or this one). I am not an expert in Cython, but it seems that in some cases you still need to use the NumPy/C API and know how it works. Given that I already have (a little) knowledge of the Python/C API and none of Cython, I was wondering whether it makes sense to keep using the Python/C API, and whether this API has some advantages over Cython. In the future, I will certainly develop some things that do not involve numerical computing, so this question is not only about NumPy. One of the things I like about the Python/C API is that I learn something about how the Python interpreter works.

Thanks.

The current "top answer" sounds a bit too much like FUD to my ears. For one, it is not immediately obvious that the Average Developer would write faster code in C than what NumPy+Cython gives you anyway. Quite the contrary: the time it takes to even get the necessary C code to work correctly in a Python environment is usually much better invested in writing a quick prototype in Cython, benchmarking it, optimising it, rewriting it in a faster way, benchmarking it again, and then deciding whether there is anything in it that truly requires the 5-10% more performance that you may or may not get from rewriting 2% of the code in hand-tuned C and calling it from your Cython code.

I'm writing a library in Cython that currently has about 18K lines of Cython code, which translate to almost 200K lines of C code. I once managed to get a speed-up of almost 25% for a couple of very important internal base-level functions by injecting some 20 lines of hand-tuned C code in the right places. It took me a couple of hours to rewrite and optimise this tiny part. That's truly nothing compared to the huge amount of time I saved by not writing (and having to maintain) the library in plain C in the first place.

Even if you know C a lot better than Cython, if you know Python and C, you will learn Cython so quickly that it's worth the investment in any case, especially when you are into numerics. 80-95% of the code you write will benefit so much from being written in a high-level language that you can safely sit back and invest half of the time you saved into making your code just as fast as if you had written it in a low-level language right away.

That being said, your comment that you want "to be able to distribute the code as standalone C/C++ libraries" is a valid reason to stick to plain C/C++. Cython always depends on CPython, which is quite a dependency. However, using plain C/C++ (except for the Python interface) will not allow you to take advantage of NumPy either, as that also depends on CPython. So, as usual when writing something in C, you will have to do a lot of groundwork before you get to the actual functionality. You should seriously think twice about this before you start the work.

First, there is one point in your question I don't get:

[...] also want to be able to distribute the code as standalone C/C++ libraries. [...] Some functions will need to call a Python function from the C/C++ code.

How is this supposed to work?

Next, as to your actual question, there are certainly advantages to using the Python/C API directly:

  • Most likely, you are more familiar with writing C code than with writing Cython code.

  • Writing your code in C gives you maximum control. To get the same performance from Cython code as from equivalent C code, you'll have to be very careful. You'll not only need to make sure to declare the types of all variables, you'll also have to set some flags adequately; bounds checking is just one example. You will need intimate knowledge of how Cython works to get the best performance.

  • Cython code depends on Python. It does not seem to be a good idea to use Cython for code that should also be distributed as a standalone C library.

The main disadvantage of the Python/C API is that it can be very slow if it's used in an inner loop. I'm seeing that calling a Python function takes an 80-160x hit over calling an equivalent C++ function.
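A rough way to get a feel for per-call overhead on your own machine is a small timing sketch like the one below. It does not reproduce the C++ comparison above (that needs a compiled extension); it only compares a loop that makes one Python-level function call per iteration with the same loop written inline. The names `square`, `loop_with_calls`, `loop_inlined` and `per_iteration_seconds` are made up for this illustration, and the numbers will vary by machine and interpreter version.

```python
import timeit


def square(x):
    """Trivial Python-level function standing in for a user callback."""
    return x * x


def loop_with_calls(n):
    """Inner loop that pays for one Python function call per iteration."""
    total = 0.0
    for i in range(n):
        total += square(i)
    return total


def loop_inlined(n):
    """Same arithmetic as loop_with_calls, but without the function call."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total


def per_iteration_seconds(func, n=100_000, repeat=5):
    """Best-of-`repeat` wall-clock time per loop iteration, in seconds."""
    best = min(timeit.repeat(lambda: func(n), number=1, repeat=repeat))
    return best / n


if __name__ == "__main__":
    with_calls = per_iteration_seconds(loop_with_calls)
    inlined = per_iteration_seconds(loop_inlined)
    print(f"with call: {with_calls * 1e9:.1f} ns per iteration")
    print(f"inlined:   {inlined * 1e9:.1f} ns per iteration")
```

Whatever the exact figures, the point is the same as in the answer: the cost is paid per call, so it matters most when the callback sits in the innermost loop.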

If that overhead doesn't matter for your code, then you benefit from being able to write some chunks of code in Python, having access to Python libraries, and supporting callbacks written directly in Python. It also means that you can make some changes without recompiling, which makes prototyping easier.
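To illustrate the callback pattern, here is a minimal pure-Python sketch of conjugate gradient (one of the methods named in the question) in which the linear operator is supplied as a Python callable. This is an illustration under stated assumptions, not the asker's code: the names `conjugate_gradient`, `matvec` and `matvec_2x2` are hypothetical, the matrix is assumed to be symmetric positive definite, and a real implementation would use NumPy/BLAS rather than Python lists.

```python
import math


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A.

    A is never formed explicitly; `matvec(v)` is a user-supplied
    callback that returns the product A v as a list of floats.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x, with x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:  # converged: residual norm is small
            break
        Ap = matvec(p)               # the only place the callback is used
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def matvec_2x2(v):
    """Example callback: multiply by A = [[4, 1], [1, 3]]."""
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


if __name__ == "__main__":
    # The exact solution of A x = [1, 2] is x = [1/11, 7/11].
    print(conjugate_gradient(matvec_2x2, [1.0, 2.0]))
```

Because the solver calls `matvec` only once per iteration, the per-call overhead is usually small next to the work done inside the callback. Swapping in a different operator needs no recompilation.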
