
Cythonize two small numpy functions, help needed

The problem

I'm trying to Cythonize two small functions that mostly operate on numpy ndarrays for a scientific application. These two small functions are called millions of times in a genetic algorithm and account for most of its run time.

I made some progress on my own and both versions work, but I get only a small speed improvement (about 10%). More importantly, cython --annotate shows that most of the code still goes through Python.

The code

First function:

The aim of this function is to return slices of the data; it is called millions of times in an inner nested loop. Depending on the flag in data[1][1], the slices are returned in forward or reverse order.

# IPython notebook magic for Cython
%%cython --annotate
import numpy as np
from scipy import signal as scisignal

cimport cython
cimport numpy as np
def get_signal(data):
    #data[0] contains the data structure containing the numpy arrays
    #data[1][0] contains the position to slice
    #data[1][1] contains the orientation to slice, forward = 0, reverse = 1

    cdef int halfwinwidth = 100
    cdef int midpoint = data[1][0]
    cdef int strand = data[1][1]
    cdef int start = midpoint - halfwinwidth
    cdef int end = midpoint + halfwinwidth
    #the arrays we want to slice
    cdef np.ndarray r0 = data[0]['normals_forward']
    cdef np.ndarray r1 = data[0]['normals_reverse']
    cdef np.ndarray r2 = data[0]['normals_combined']
    if strand == 0:
        normals_forward = r0[start:end]
        normals_reverse = r1[start:end]
        normals_combined = r2[start:end]
    else:
        normals_forward = r1[end - 1:start - 1: -1]
        normals_reverse = r0[end - 1:start - 1: -1]
        normals_combined = r2[end - 1:start - 1: -1]
    #return the result as a tuple
    row = (normals_forward,
           normals_reverse,
           normals_combined)
    return row
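
One detail of the reverse branch is worth checking. With a negative step, a stop index of -1 means "the last element", so arr[end - 1:start - 1:-1] comes back empty when start is 0. Below is a minimal sketch in plain numpy showing an equivalent spelling without that edge case. The helper name reversed_window is made up for illustration and is not part of the original code.

```python
import numpy as np

def reversed_window(arr, start, end):
    """Return arr[start:end] in reverse order (hypothetical helper)."""
    # Slice forward first, then flip; both steps produce views, not copies.
    return arr[start:end][::-1]

arr = np.arange(10)

# For start > 0 the two spellings agree.
assert np.array_equal(arr[5 - 1:2 - 1:-1], reversed_window(arr, 2, 5))

# For start == 0 the stop index becomes -1, which numpy reads as "last element",
# so the negative-step slice is empty while the two-step form is not.
print(arr[3 - 1:0 - 1:-1].size)
print(reversed_window(arr, 0, 3))
```

This only matters if a midpoint can sit exactly halfwinwidth away from the start of the array. If that never happens in the data, both forms behave the same.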

Second function:

This one receives a list of tuples of numpy arrays. We add the arrays up element-wise, normalize them by the number of tuples, and then integrate the intersection of the forward and reverse profiles.

def calculate_signal(list signal):
    cdef int halfwinwidth = 100
    cdef np.ndarray profile_normals_forward = np.zeros(halfwinwidth * 2, dtype='f')
    cdef np.ndarray profile_normals_reverse = np.zeros(halfwinwidth * 2, dtype='f')
    cdef np.ndarray profile_normals_combined = np.zeros(halfwinwidth * 2, dtype='f')
    #b is a tuple of 3 np.ndarrays containing 200 floats
    #here we add them up elementwise
    for b in signal:
        profile_normals_forward += b[0]
        profile_normals_reverse += b[1]
        profile_normals_combined += b[2]
    #normalize the arrays
    cdef int count = len(signal)

    #print "Normalizing to number of elements"
    profile_normals_forward /= count
    profile_normals_reverse /= count
    profile_normals_combined /= count
    intersection_signal = scisignal.detrend(np.fmin(profile_normals_forward, profile_normals_reverse))
    intersection_signal[intersection_signal < 0] = 0
    intersection = np.sum(intersection_signal)

    results = {"intersection": intersection,
               "profile_normals_forward": profile_normals_forward,
               "profile_normals_reverse": profile_normals_reverse,
               "profile_normals_combined": profile_normals_combined,
               }
    return results
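
The accumulation loop in this function can also be expressed as a single numpy reduction. The sketch below assumes every tuple in signal holds three arrays of equal length, as the comment in the function states. average_profiles is a hypothetical name. Whether this beats the loop depends on the list size, because building the stacked array copies the data.

```python
import numpy as np

def average_profiles(signal):
    """Average a list of (forward, reverse, combined) array tuples element-wise."""
    # Shape (n_rows, 3, width): one row per tuple in the list.
    stacked = np.asarray(signal, dtype=np.float32)
    # The mean over the first axis replaces the "+=" loop and the "/= count" step.
    return stacked.mean(axis=0)

# Tiny made-up input: two rows, window width 4.
signal = [
    (np.ones(4), np.zeros(4), np.full(4, 2.0)),
    (np.full(4, 3.0), np.zeros(4), np.full(4, 4.0)),
]
forward, reverse, combined = average_profiles(signal)
```

If the windows can be kept in one preallocated array from the start instead of a list of tuples, the conversion cost disappears as well.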

Any help is appreciated. I tried using memoryviews, but for some reason the code got much, much slower.

After fixing the array cdef (as has been indicated, with the dtype specified), you should probably put the routine in a cdef function. A cdef function is callable only from Cython code, for example from a def function in the same module.

In the function declaration, you'll need to provide the element type (and the number of dimensions, if the argument is a numpy array). Here DTYPE_t stands for a ctypedef of the array's element type:

cdef get_signal(np.ndarray[DTYPE_t, ndim=3] data):

I'm not sure using a dict is a good idea, though. You could instead keep the data in a single array and use numpy's column or row slices, such as data[:, 0].
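
As a rough illustration of that suggestion, here is a plain numpy sketch that keeps the three tracks as rows of one 2-D array instead of dict entries. The names tracks and get_window and the row order [forward, reverse, combined] are assumptions made for the example, not something taken from the question.

```python
import numpy as np

def get_window(tracks, midpoint, strand, halfwinwidth=100):
    """Slice a (3, n) array of [forward, reverse, combined] tracks around midpoint."""
    start = midpoint - halfwinwidth
    end = midpoint + halfwinwidth
    window = tracks[:, start:end]
    if strand == 0:
        return window
    # Reverse strand: swap the forward and reverse rows and flip the window.
    return window[[1, 0, 2], ::-1]

# Made-up data: three tracks of length 10 stacked into one array.
forward = np.arange(10)
reverse = np.arange(10) + 100
combined = np.arange(10) + 200
tracks = np.stack([forward, reverse, combined])

fwd_window = get_window(tracks, midpoint=5, strand=0, halfwinwidth=2)
rev_window = get_window(tracks, midpoint=5, strand=1, halfwinwidth=2)
```

The forward case returns a view, while the reverse case returns a copy because of the row reordering. A single contiguous 2-D array is also easier to type in Cython than a dict of arrays.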
