In Python, why is a module implemented in C faster than a pure Python module, and how do I write one?

The Python documentation states that cPickle is faster than pickle because the former is implemented in C. What does that mean exactly?

I am making a module for advanced mathematics in Python, and some calculations take a significant amount of time. Does that mean that if my program is implemented in C it can be made much faster?

I wish to import this module from other Python programs, just the way I can import cPickle.

Can you explain how to implement a Python module in C?

You can write fast C code and then call it from your Python scripts, so your program will run faster.[1] See http://docs.python.org/extending/index.html#extending-index

An example is NumPy, whose core is written in C ( https://numpy.org/ ).

The typical approach is to implement the bottleneck in C (or to use a library written in C, of course ;) ) for its speed, and to use Python for the remaining code.

[1] By the way, this is why cPickle is faster than pickle.
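
A rough way to see the difference for yourself is to time a loop written in Python against a built-in that does the same job in C. The sketch below assumes Python 3 on CPython; `py_sum` and the data size are made up for illustration, and the actual timings will vary by machine.

```python
import timeit


def py_sum(values):
    """Pure-Python summation: every iteration runs through the bytecode interpreter."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(100_000))

# Both approaches must agree before it makes sense to time them.
assert py_sum(data) == sum(data)

# Time 50 runs of each; sum() is implemented in C in CPython.
py_time = timeit.timeit(lambda: py_sum(data), number=50)
c_time = timeit.timeit(lambda: sum(data), number=50)

print(f"pure Python loop: {py_time:.4f} s")
print(f"built-in sum (C): {c_time:.4f} s")
```

The same idea applies to your own module: keep the Python interface, and move only the loop that dominates the timing into C.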

edit:

Take a look at Pyrex: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/About.html

'Pyrex is a language specially designed for writing Python extension modules. It's designed to bridge the gap between the nice, high-level, easy-to-use world of Python and the messy, low-level world of C. '

It's not the 'official' way, but it may be useful.

As mentioned, numpy is excellent for vector computations. (Could be better still, but the comment that it's better than anything you could write without actually doing work is definitely true.)

Not everything can be easily vectorized, though. If you have tight inner loops with lots of function calls (say, a heavily recursive algorithm), you still have a couple of options. Probably the most popular is Cython, which allows you to write modules and functions in a kind of annotated Python and get C-like speed when you need it.

Or maybe your time is all dominated by library calls to compute eigenvalues or invert matrices or evaluate special functions or divide really large integers -- many of which the Sage project handles very well, by the way, if what you're doing is more mathematical than pure crunching -- in which case the time spent in Python might not even matter. It all depends on the details of the kind of numerics you're doing.
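
One way to find out which case you are in is to profile before rewriting anything. This is a minimal sketch using the standard-library `cProfile` and `pstats` modules, assuming Python 3; `slow_inner` and `compute` are hypothetical stand-ins for your own calculation.

```python
import cProfile
import io
import pstats


def slow_inner(n):
    """Hypothetical hot spot: a tight pure-Python loop."""
    total = 0.0
    for i in range(1, n):
        total += 1.0 / (i * i)
    return total


def compute():
    """Hypothetical top-level calculation that calls the hot spot repeatedly."""
    return [slow_inner(20_000) for _ in range(20)]


# Collect timing data only around the call we care about.
profiler = cProfile.Profile()
profiler.enable()
result = compute()
profiler.disable()

# Write the five most expensive entries (by cumulative time) into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

If the report is dominated by your own Python functions, they are candidates for Cython or C. If it is dominated by calls into a compiled library, rewriting the Python part will gain little.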

When you write a function in Python, a new function object is created, and the function code is parsed and byte-compiled (and saved in the "func_code" attribute, which is named "__code__" in Python 3). When you call that function, the interpreter reads its bytecode and executes it.

If you write the same function in C, following the Python/C API to make it available in Python, the interpreter will create the function object, but this function won't have any bytecode. When the interpreter finds a call to that function, it calls the real C function, so it executes at "machine" speed and not at "python-machine" speed.

You can verify this by inspecting functions written in C (a Python 2 session):

>>> map.func_code
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute 'func_code'
>>> def mymap():pass
... 
>>> mymap.func_code
<code object mymap at 0xcfb5b0, file "<stdin>", line 1>
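
A rough Python 3 equivalent of the same check, assuming CPython: the attribute is named `__code__` there, and `map` is a class in Python 3, so this sketch inspects the built-in `len` instead.

```python
import dis


def mymap():
    pass


# A function defined in Python carries a code object holding its bytecode.
print(mymap.__code__)
dis.dis(mymap)  # prints the bytecode instructions the interpreter will execute

# A function implemented in C (such as len in CPython) has no code object.
print(type(len).__name__)
print(hasattr(len, "__code__"))
```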

To learn how to write C code for use from Python, follow the guides on the official site.

Anyway, if you are simply doing N-dimensional array calculations, NumPy ought to be sufficient.

Besides Pyrex/Cython, already mentioned, you have other alternatives:

Shed Skin: Translates (a restricted subset of) Python to C++. It can automatically generate an extension module for you. You would create an extension like this (assuming Linux):

wget http://shedskin.googlecode.com/files/shedskin-0.7.tgz
tar -xzf shedskin-0.7.tgz
# In your code folder:
PYTHONPATH=/path/to/shedskin-0.7 python shedskin -e yourmodule.py
# The above generates a Makefile and a yourmodule.h/.cpp pair
make
# Now you can "import yourmodule" from Python and check that it comes from the .so with "print yourmodule.__file__"

PyPy: A faster Python, with a JIT compiler. You can simply run your code on it instead of CPython. At the time of writing it only supports Python 2.5, with 2.7 support coming soon. It can give huge speedups on math-heavy code. To install and run it (assuming 32-bit Linux):

wget http://pypy.org/download/pypy-1.4.1-linux.tar.bz2
tar -xjf pypy-1.4.1-linux.tar.bz2
sudo ln -s /path/to/pypy-1.4.1-linux/bin/pypy /usr/local/bin
# Then, instead of "python yourprogram.py" you'll just run "pypy yourprogram.py"

Weave: Allows you to write C inline, then compiles it.
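
To compare interpreters on your own machine, a small timing script that reports which implementation ran it can help. This is only a sketch, assuming a Python 3 interpreter on both sides; `harmonic` is a hypothetical stand-in for your math-heavy code.

```python
import platform
import time


def harmonic(n):
    """Hypothetical math-heavy workload: a tight floating-point loop."""
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total


# Time a single run of the workload with a high-resolution clock.
start = time.perf_counter()
value = harmonic(1_000_000)
elapsed = time.perf_counter() - start

# python_implementation() reports e.g. "CPython" or "PyPy".
print(f"{platform.python_implementation()} {platform.python_version()}: "
      f"harmonic(1_000_000) = {value:.6f} in {elapsed:.3f} s")
```

Run the same file once with each interpreter and compare the reported times.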

Edit: If you want us to run these tools for you and benchmark, just post your code ;)
