Poor(er) performance of Cython with NumPy array memoryview compared to C arrays

I encountered a pretty weird result from a benchmark.

Those are all different flavors of a bubble sort implementation, and the fastest approach at n=10^4 is converting Python lists to C arrays internally. In contrast, the yellow line corresponds to code where I am using NumPy arrays with a memoryview. I expected the results to be the other way around. I (and colleagues) repeated the benchmark a couple of times and always got the same results. Maybe someone has an idea of what is going on here ...

The black line in the plot corresponds to this code:

%%cython
cimport cython
from libc.stdlib cimport malloc, free

def cython_bubblesort_clist(a_list):
    """ 
    The Cython implementation of bubble sort with internal
    conversion between Python list objects and C arrays.

    """
    cdef int *c_list
    cdef int count, i, j # static type declarations
    count = len(a_list)
    c_list = <int *>malloc(count * cython.sizeof(int))
    if not c_list:
        # malloc returns NULL when the allocation fails
        raise MemoryError()

    # convert Python list to C array
    for i in range(count):
        c_list[i] = a_list[i]

    for i in range(count):
        for j in range(1, count):
            if c_list[j] < c_list[j-1]:
                c_list[j-1], c_list[j] = c_list[j], c_list[j-1]

    # convert C array back to Python list
    for i in range(count):
        a_list[i] = c_list[i]

    free(c_list)
    return a_list

and the pink line to this code:

%%cython
import numpy as np
cimport numpy as np
cimport cython
def cython_bubblesort_numpy(long[:] np_ary):
    """ 
    The Cython implementation of bubble sort with NumPy memoryview.

    """
    cdef int count, i, j # static type declarations
    count = np_ary.shape[0]

    for i in range(count):
        for j in range(1, count):
            if np_ary[j] < np_ary[j-1]:
                np_ary[j-1], np_ary[j] = np_ary[j], np_ary[j-1]

    return np.asarray(np_ary)

As suggested in the comments above, I added the decorators:

%%cython
import numpy as np
cimport numpy as np
cimport cython
@cython.boundscheck(False) 
@cython.wraparound(False)
cpdef cython_bubblesort_numpy(long[:] np_ary):
    """ 
    The Cython implementation of bubble sort with NumPy memoryview.

    """
    cdef int count, i, j # static type declarations
    count = np_ary.shape[0]

    for i in range(count):
        for j in range(1, count):
            if np_ary[j] < np_ary[j-1]:
                np_ary[j-1], np_ary[j] = np_ary[j], np_ary[j-1]

    return np.asarray(np_ary)

and the results are more what I expected now :)
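
For anyone who wants to repeat the comparison, here is a minimal timing sketch. It is not the original benchmark script: `python_bubblesort` and `time_sort` are hypothetical names, and the pure-Python sort only stands in for the Cython variants above, which could be passed to `time_sort` in the same way.

```python
import random
import timeit

def python_bubblesort(a_list):
    """Pure-Python reference with the same loop structure as the Cython versions."""
    count = len(a_list)
    for i in range(count):
        for j in range(1, count):
            if a_list[j] < a_list[j - 1]:
                a_list[j - 1], a_list[j] = a_list[j], a_list[j - 1]
    return a_list

def time_sort(sort_func, n, repeat=3):
    """Best-of-`repeat` wall time (seconds) for sorting a random list of size n."""
    random.seed(0)  # fixed seed so every variant sees the same input
    data = [random.randint(0, n) for _ in range(n)]
    # The sorts work in place, so each run gets a fresh copy of the input.
    # The cost of that copy is included in the measured time.
    timings = timeit.repeat(lambda: sort_func(list(data)), number=1, repeat=repeat)
    return min(timings)

print(time_sort(python_bubblesort, 500))
```

For the memoryview variant the input would have to be a NumPy array with a matching dtype rather than a list, so the copy step would need to be adapted.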

It is worth making one trivial change to your code to see if it improves things further:

cpdef cython_bubblesort_numpy(long[::1] np_ary):
    # ...

This tells Cython that np_ary is a C-contiguous array, and the generated code in the nested for loops can be further optimized with this information.
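
To illustrate what "contiguous" means here, the sketch below uses Python's built-in `memoryview` on an `array.array` (not Cython and not NumPy, purely as an illustration). A `long[:]` argument may be a strided view like `every_other`, so element access has to account for the stride; `long[::1]` promises a unit stride like `full`.

```python
from array import array

# A typed buffer of C longs, viewed through Python's built-in memoryview.
buf = array("l", range(10))
full = memoryview(buf)
every_other = full[::2]  # strided view: skips every second element

# Contiguous view: the stride equals the item size.
print(full.c_contiguous, full.strides)
# Strided view: the stride is twice the item size, so it is not C-contiguous.
print(every_other.c_contiguous, every_other.strides)
```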

This code won't accept non-contiguous arrays as arguments, but that is fairly trivial to handle by using numpy.ascontiguousarray().
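
A small sketch of that conversion on the Python side, assuming NumPy is installed. `as_c_contiguous` is a hypothetical wrapper name used only for illustration.

```python
import numpy as np

def as_c_contiguous(ary):
    """Hypothetical helper: return a C-contiguous version of `ary`.

    A non-contiguous input is copied into a new contiguous array.
    """
    return np.ascontiguousarray(ary)

data = np.arange(10)
strided = data[::2]               # a non-contiguous view of `data`
fixed = as_c_contiguous(strided)  # a contiguous copy of that view

print(strided.flags.c_contiguous, fixed.flags.c_contiguous)
```

Because a non-contiguous input is copied, an in-place sort of `fixed` would not change `data`; use the array returned by the sort function rather than relying on the original being modified.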
