
Why do Cython-embedded plugins have higher performance in the CPython interpreter than Rust C-interface versions?

I would like to ask some questions about the underlying principles of python interpreters, because I didn't get much useful information during my own search.

I've been using Rust to write Python plugins lately. This gives a significant speedup for Python's CPU-intensive tasks, and it's also faster to write than C. However, it has one disadvantage: compared to the old scheme of using Cython for acceleration, the call overhead of Rust (I'm using pyo3) seems to be greater than that of C (I'm using Cython).

For example, take an empty Python function:

def empty_function():
    return 0

Call it a million times from Python in a for loop and time it: each call takes about 70 nanoseconds (on my PC).
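A minimal sketch of such a measurement, assuming timeit is used as described in the update below. The helper name per_call_ns and its parameters are hypothetical, and the figure includes the cost of the for loop itself, so it is only an upper bound on the pure call overhead:

```python
import timeit


def empty_function():
    return 0


def per_call_ns(func, calls=1_000_000, repeats=5):
    """Return the best observed time per call, in nanoseconds.

    Hypothetical helper: the for-loop overhead is included in the result.
    """
    def loop():
        for _ in range(calls):
            func()

    # Take the minimum over several repeats to reduce noise from other processes.
    best = min(timeit.repeat(loop, number=1, repeat=repeats))
    return best / calls * 1e9


if __name__ == "__main__":
    print(f"{per_call_ns(empty_function):.1f} ns per call")
```

The same helper can be pointed at the Cython or pyo3 version of empty_function once the corresponding module is imported, so all three are measured the same way.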

If we compile it as a Cython plugin, with nearly the same source code:

# test.pyx
cpdef unsigned int empty_function():
    return 0

The execution time drops to 40 nanoseconds per call. This means we can use Cython for fine-grained embedding and expect it to execute faster than native Python.

However, when it comes to Rust (honestly speaking, I now prefer Rust to Cython for plugin development, because there's no need for weird hacks in the grammar), the call time increases to 140 nanoseconds, about twice as much as native Python. The source code is as follows:

use pyo3::prelude::*;
use pyo3::wrap_pyfunction;

#[pyfunction]
fn empty_function() -> usize {
    0
}

#[pymodule]
fn testlib(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(empty_function, m)?)?;
    Ok(())
}

This suggests that Rust is not suitable for fine-grained embedded replacement of Python. If a task is called only a few times and each call takes a long time, Rust is a perfect fit. However, if a task is called very frequently in the code, Rust seems unsuitable, because the per-call overhead (such as type conversion) eats up most of the time saved.
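To illustrate the trade-off, here is a rough sketch that uses pure-Python stand-ins rather than a real extension. The functions add_one and add_one_batch are hypothetical placeholders for a per-item and a per-batch plugin function; the point is only that the coarse-grained version crosses the call boundary once instead of once per element:

```python
import timeit


def add_one(x):
    # Stand-in for a tiny plugin function that is called once per item.
    return x + 1


def add_one_batch(xs):
    # Stand-in for a plugin function that receives the whole batch at once.
    return [x + 1 for x in xs]


def fine_grained(xs):
    # One call across the boundary per element: overhead is paid len(xs) times.
    return [add_one(x) for x in xs]


def coarse_grained(xs):
    # A single call across the boundary: overhead is paid once.
    return add_one_batch(xs)


if __name__ == "__main__":
    data = list(range(100_000))
    assert fine_grained(data) == coarse_grained(data)
    t_fine = min(timeit.repeat(lambda: fine_grained(data), number=10, repeat=3))
    t_coarse = min(timeit.repeat(lambda: coarse_grained(data), number=10, repeat=3))
    print(f"fine-grained:   {t_fine:.4f} s")
    print(f"coarse-grained: {t_coarse:.4f} s")
```

The timings here say nothing about pyo3 or Cython themselves; they only show how the number of calls affects the total when the work per call is tiny.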

I want to know if this can be solved and, more importantly, I want to know the underlying rationale for this discrepancy. Is there some kind of difference with the cpython interpreter when calling between them, like the difference between cpython and pypy when calling c plugins? Where can I get further information? Thanks.

===

Update:

Sorry guys, I didn't anticipate that my question would be ambiguous; after all, the source code for all three versions was given, and using timeit to measure function runtimes is almost a convention in Python development.

My test code is nearly the same as @Jmb's code in the comments, with the subtle difference that I build with python setup.py build_ext --inplace instead of bare gcc, but that should not make any difference. Anyway, thanks for the supplement.

It's also worth noting here that compiling Rust extensions with python setup.py build_ext --inplace builds them in unoptimised mode (the same goes for python setup.py develop or pip install -e .).

Here's an excerpt from the build output:

Finished dev [unoptimized + debuginfo] target(s) in 0.02s

To build in "release" mode with an optimised binary, use:

pip install .

With pip install . --verbose you can see the difference:

Finished release [optimized] target(s) in 1.02s

This can make a massive difference: in my case, the unoptimised build is 9x slower than the optimised build.

As suggested in the comments, this is a self-answer.

Since the discussion in the comments section did not lead to a clear conclusion, I raised an issue in pyo3's repo and got a response from its main maintainer.

In short, the conclusion is that there is no fundamental difference in how CPython calls plugins compiled by pyo3 or Cython. The current speed difference comes from different depths of optimization.

Here is the link to the issue: https://github.com/PyO3/pyo3/issues/1470
