I have been trying to learn Cython to speed up some of my calculations. Here is a subset of what I am trying to do: it simply integrates a differential equation using a recursive formula, making use of NumPy arrays. I have already achieved a ~100x speed increase over the pure Python version. However, judging by the HTML file generated for my code by the cython -a command, it seems I can gain additional speed. My code is as follows (the lines that show up yellow in the HTML file, and that I would like to make white, are labeled):
    %%cython
    import numpy as np
    cimport numpy as np
    cimport cython
    from libc.math cimport exp, sqrt

    @cython.boundscheck(False)
    cdef double riccati_int(double j, double w, double h, double an, double d):
        cdef:
            double W
            double an1
        W = sqrt(w**2 + d**2)
        #dark_yellow
        an1 = (((d - (W + w) * an) * exp(-2 * W * h / j) - d - (W - w) * an) /
               ((d * an - W + w) * exp(-2 * W * h / j) - d * an - W - w))
        return an1

    def acalc(double j, double w):
        cdef:
            int xpos, i, n
            np.ndarray[np.int_t, ndim=1] xvals
            np.ndarray[np.double_t, ndim=1] h, a
        xpos = 74
        xvals = np.array([0, 8, 23, 123, 218], dtype=np.int) #dark_yellow
        h = np.array([1, .1, .01, .1], dtype=np.double) #dark_yellow
        a = np.empty(219, dtype=np.double) #dark_yellow
        a[0] = 1 / (w + sqrt(w**2 + 1)) #light_yellow
        for i in range(h.size): #dark_yellow
            for n in range(xvals[i], xvals[i + 1]): #light_yellow
                if n < xpos:
                    a[n+1] = riccati_int(j, w, h[i], a[n], 1.) #light_yellow
                else:
                    a[n+1] = riccati_int(j, w, h[i], a[n], 0.) #light_yellow
        return a

It seems to me like all 9 lines that I labeled above should be able to be made white with the proper adjustments. One issue is the ability to define NumPy arrays the proper way. But probably even more important is the ability to get the first labeled line to work efficiently, since this is where the bulk of the calculation is done. I tried reading the generated C code that the HTML file displays after clicking on a yellow line, but I honestly have no clue how to read that code. If anybody could please help me out, it would be greatly appreciated.
I don't think you need to care about yellow lines that are not in a loop. Adding the following compiler directives will make the three lines in the loop faster:

    @cython.cdivision(True)
    cdef double riccati_int(double j, double w, double h, double an, double d):
        pass  # body as in the question

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def acalc(double j, double w):
        pass  # body as in the question

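Since these directives switch off safety checks (the zero-division check for cdivision, the index checks for boundscheck and wraparound), it may be worth keeping a slow pure-Python reference around to compare the compiled result against. A minimal sketch, assuming the same formula and grid as in the question; the names riccati_int_py and acalc_py are made up for illustration and a plain list stands in for the NumPy array:

```python
from math import exp, sqrt


def riccati_int_py(j, w, h, an, d):
    """One step of the recursion from the question, in plain Python."""
    W = sqrt(w**2 + d**2)
    E = exp(-2 * W * h / j)
    num = (d - (W + w) * an) * E - d - (W - w) * an
    den = (d * an - W + w) * E - d * an - W - w
    # Python raises ZeroDivisionError if den == 0; with cdivision(True)
    # the compiled version performs a plain C division instead.
    return num / den


def acalc_py(j, w, xpos=74):
    """Reference version of acalc; returns a list of 219 floats."""
    xvals = [0, 8, 23, 123, 218]
    h = [1, .1, .01, .1]
    a = [0.0] * 219
    a[0] = 1 / (w + sqrt(w**2 + 1))
    for i in range(len(h)):
        for n in range(xvals[i], xvals[i + 1]):
            d = 1. if n < xpos else 0.
            a[n + 1] = riccati_int_py(j, w, h[i], a[n], d)
    return a
```

Comparing the output of the compiled acalc with acalc_py for a few (j, w) pairs, for example with np.allclose, should show whether the faster build still computes the same values.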
I'm not sure whether it makes a difference, but you could use memoryviews for the arrays, e.g.

    cdef double [:] h = np.array([1, .1, .01, .1], dtype=np.double) #dark_yellow
    cdef double [:] a = np.empty(219, dtype=np.double) #dark_yellow

Also, creating a NumPy array for four static values is a bit overdone. It can be replaced by a static C array (the loop bound h.size then has to be written out as 4, since a C array has no size attribute):

    cdef double *h = [1, .1, .01, .1]

However, as mentioned, what is in the loop matters most. Since line_profiler won't work for Cython (afaik), use the time module to benchmark within the function, in addition to cProfile. That might give you an idea of how the intensity of the line color in the Cython annotation has to be assessed in context.
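The benchmarking suggestion above could look roughly like this. It is only a sketch: work is a hypothetical stand-in for the compiled function, and time_call and profile_call are helper names invented here, not part of any library. The same pair of time.perf_counter() calls can also be placed around a section inside the function.

```python
import cProfile
import io
import pstats
import time


def work(n):
    """Hypothetical stand-in for the compiled function being measured."""
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / k
    return total


def time_call(func, *args, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def profile_call(func, *args):
    """Run func once under cProfile and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()


best = time_call(work, 100_000)
report = profile_call(work, 100_000)
print(f"best of 5: {best:.6f} s")
print(report)
```

Taking the best of several runs reduces noise from other processes; the cProfile report gives per-function totals, so it complements rather than replaces the timing of individual sections.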
As I learned, it is recommended to use the Python index types for indexing, i.e. either

    size_t i, n

or

    Py_ssize_t i, n

The second one is the signed version.
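The difference between the unsigned and the signed type can be illustrated from plain Python with ctypes. This is only a sketch: ctypes.c_ssize_t is used here as a stand-in for Py_ssize_t, on the assumption that both are the signed counterpart of size_t with the same width.

```python
import ctypes

# Store -1 in an unsigned and in a signed index type.
as_unsigned = ctypes.c_size_t(-1).value
as_signed = ctypes.c_ssize_t(-1).value

bits = 8 * ctypes.sizeof(ctypes.c_size_t)
print(as_signed)                   # -1
print(as_unsigned == 2**bits - 1)  # True: -1 wrapped to the largest value
```

So an index that can become negative, for instance in a loop that counts down past zero, is only safe in the signed type.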