Cythonize a partial differential equation integrator

Question

I am trying to speed up a finite-difference integrator for a partial differential equation using Cython. I am not sure what I need to do in order for Cython to work correctly with the NumPy arrays.

The diffusion term function that I use is:

import numpy

def laplacian(var, dh2):
    """ (1D array, dx^2) -> laplacian(1D array)
    periodic_laplacian_1D_4th_order
    Implementing the 4th order 1D laplacian with periodic condition
    """
    lap = numpy.zeros_like(var)
    lap[1:]    = (4.0/3.0)*var[:-1]
    lap[0]     = (4.0/3.0)*var[-1]  # periodic wrap: the left neighbour of element 0 is the last element
    lap[:-1]  += (4.0/3.0)*var[1:]
    lap[-1]   += (4.0/3.0)*var[0]
    lap       += (-5.0/2.0)*var

    lap[2:]   += (-1.0/12.0)*var[:-2]
    lap[:2]   += (-1.0/12.0)*var[-2:]
    lap[:-2]  += (-1.0/12.0)*var[2:]
    lap[-2:]  += (-1.0/12.0)*var[:2]

    return lap / dh2

And the right-hand sides of the model equations are:

from derivatives import laplacian

def dbdt(b,w,p,m,d,dx2):
    """ db/dt of Modified Klausmeier """
    return w*b**2 - m*b + laplacian(b,dx2)

def dwdt(b,w,p,m,d,dx2):
    """ dw/dt of Modified Klausmeier """
    return p - w - w*b**2 + d*laplacian(w,dx2)  # diffusion acts on w in the dw/dt equation
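
For context, right-hand sides like these are typically evaluated once per time step. Below is a minimal sketch assuming a forward Euler scheme, which is an assumption, since the post does not say which scheme the integrator uses. The names rhs and euler_step, the grid, and all parameter values are hypothetical. The Laplacian is written with numpy.roll for brevity, and the water equation is assumed to diffuse w.

```python
import numpy as np

def laplacian(var, dh2):
    """Periodic 4th-order 1D Laplacian, written with numpy.roll for brevity."""
    return (
        -(1.0 / 12.0) * (np.roll(var, 2) + np.roll(var, -2))
        + (4.0 / 3.0) * (np.roll(var, 1) + np.roll(var, -1))
        - (5.0 / 2.0) * var
    ) / dh2

def rhs(b, w, p, m, d, dx2):
    """Right-hand sides of both model equations (assumes w diffuses with coefficient d)."""
    db = w * b**2 - m * b + laplacian(b, dx2)
    dw = p - w - w * b**2 + d * laplacian(w, dx2)
    return db, dw

def euler_step(b, w, dt, p, m, d, dx2):
    """Advance both fields by one forward Euler step of size dt."""
    db, dw = rhs(b, w, p, m, d, dx2)
    return b + dt * db, w + dt * dw

if __name__ == "__main__":
    # Hypothetical grid and parameters, chosen only for illustration.
    n, dx, dt = 128, 0.5, 1e-4
    p, m, d = 2.0, 0.45, 100.0
    rng = np.random.default_rng(0)
    b = 1.0 + 0.01 * rng.standard_normal(n)  # slightly perturbed biomass
    w = np.full(n, p)                        # uniform water
    for _ in range(100):
        b, w = euler_step(b, w, dt, p, m, d, dx * dx)
    print(b.mean(), w.mean())
```

In a loop like this, most of the time per step is likely spent in the two laplacian calls, which is why that function is the natural target for Cython.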

How can I optimize those functions using Cython?

I have a GitHub repository with my working code, which integrates the Gray-Scott model: Gray-Scott model integrator.

Answer 1

To use Cython efficiently, you should make all loops explicit and make sure the annotated output of cython -a shows as few Python calls as possible. A first try would be:

import numpy as np
cimport numpy as np
cimport cython
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
def laplacian(double [::1] var, double dh2):
    """ (1D array, dx^2) -> laplacian(1D array)
    periodic_laplacian_1D_4th_order
    Implementing the 4th order 1D laplacian with periodic condition
    """
    cdef int n = var.shape[0]
    cdef double[::1] lap = np.zeros(n)
    cdef int i
    for i in range(0, n-1):
        lap[1+i] = (4.0/3.0)*var[i]
    lap[0]     = (4.0/3.0)*var[n-1]  # periodic wrap: the left neighbour of element 0 is the last element
    for i in range(0, n-1):
        lap[i]  += (4.0/3.0)*var[1+i]
    lap[n-1]   += (4.0/3.0)*var[0]
    for i in range(0, n):
        lap[i]       += (-5.0/2.0)*var[i]

    for i in range(0, n-2):
        lap[2+i]   += (-1.0/12.0)*var[i]
    for i in range(0, 2):
        lap[i]   += (-1.0/12.0)*var[n - 2 + i]
    for i in range(0, n-2):
        lap[i]   += (-1.0/12.0)*var[i+2]
    for i in range(0, 2):
        lap[n-2+i]  += (-1.0/12.0)*var[i]
    for i in range(0, n):
        lap[i]  /= dh2
    return lap

Now this gives you:

$ python -m timeit -s 'import numpy as np; from lap import laplacian; var = np.random.rand(1000000); dh2 = .01' 'laplacian(var, dh2)'
100 loops, best of 3: 11.5 msec per loop

while the NumPy code gave:

100 loops, best of 3: 18.5 msec per loop

Note that the Cython code could be further optimized, for example by merging the loops.
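
As a sketch of what merging the loops could look like, the pure-Python function below computes each output element once from its four neighbours. It only illustrates the index logic, and the name laplacian_single_pass is made up here. It wraps the first element to the last one, as the docstring's periodic condition implies. Indices are shifted by n before taking the modulo so that the expression stays valid under cdivision(True), where % follows C semantics for negative operands. In Cython the same loop body would sit under typed declarations like those in the function above; no timing is claimed for that variant.

```python
import math

def laplacian_single_pass(var, dh2):
    """Periodic 4th-order 1D Laplacian computed in a single pass over the grid."""
    n = len(var)
    lap = [0.0] * n
    for i in range(n):
        # Adding n before the modulo keeps every index non-negative,
        # so the wrap-around also works with C-style remainder.
        left2 = var[(i - 2 + n) % n]
        left1 = var[(i - 1 + n) % n]
        right1 = var[(i + 1) % n]
        right2 = var[(i + 2) % n]
        lap[i] = (
            -(1.0 / 12.0) * (left2 + right2)
            + (4.0 / 3.0) * (left1 + right1)
            - (5.0 / 2.0) * var[i]
        ) / dh2
    return lap

if __name__ == "__main__":
    # Sanity check on a periodic grid: the Laplacian of sin(x) should be close to -sin(x).
    n = 64
    h = 2.0 * math.pi / n
    u = [math.sin(i * h) for i in range(n)]
    out = laplacian_single_pass(u, h * h)
    print(max(abs(o + s) for o, s in zip(out, u)))
```

A single pass reads each input element from a small moving window and writes each output element once, instead of sweeping the arrays several times.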

I also tried a customized (i.e. not committed to master) version of Pythran. Without changing the original Python code, I got the same speedup as with the Cython version, without the hassle of converting the code:

#pythran export laplacian(float [], float)
import numpy
def laplacian(var, dh2):
    """ (1D array, dx^2) -> laplacian(1D array)
    periodic_laplacian_1D_4th_order
    Implementing the 4th order 1D laplacian with periodic condition
    """
    lap = numpy.zeros_like(var)
    lap[1:]    = (4.0/3.0)*var[:-1]
    lap[0]     = (4.0/3.0)*var[-1]  # periodic wrap: the left neighbour of element 0 is the last element
    lap[:-1]  += (4.0/3.0)*var[1:]
    lap[-1]   += (4.0/3.0)*var[0]
    lap       += (-5.0/2.0)*var

    lap[2:]   += (-1.0/12.0)*var[:-2]
    lap[:2]   += (-1.0/12.0)*var[-2:]
    lap[:-2]  += (-1.0/12.0)*var[2:]
    lap[-2:]  += (-1.0/12.0)*var[:2]

    return lap / dh2

Converted with:

$ pythran lap.py -O3

And I get:

100 loops, best of 3: 11.6 msec per loop

Answer 2

So I guess I've figured it out, though I am not sure this is the most optimized way to do it:

import numpy as np
cimport numpy as np
cdef laplacian(np.ndarray[np.float64_t, ndim=1] var,np.float64_t dh2):
    """ (1D array, dx^2) -> laplacian(1D array)
    periodic_laplacian_1D_4th_order
    Implementing the 4th order 1D laplacian with periodic condition
    """
    lap = np.zeros_like(var)
    lap[1:]    = (4.0/3.0)*var[:-1]
    lap[0]     = (4.0/3.0)*var[-1]  # periodic wrap: the left neighbour of element 0 is the last element
    lap[:-1]  += (4.0/3.0)*var[1:]
    lap[-1]   += (4.0/3.0)*var[0]
    lap       += (-5.0/2.0)*var

    lap[2:]   += (-1.0/12.0)*var[:-2]
    lap[:2]   += (-1.0/12.0)*var[-2:]
    lap[:-2]  += (-1.0/12.0)*var[2:]
    lap[-2:]  += (-1.0/12.0)*var[:2]

    return lap / dh2

I have used the following setup.py:

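from setuptools import setup  # distutils was removed from the standard library in Python 3.12
from Cython.Build import cythonize
import numpy

setup(
    ext_modules = cythonize("derivatives_c.pyx"),
    # the .pyx file cimports numpy, so the NumPy headers must be on the include path
    include_dirs = [numpy.get_include()],
)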

Any advice on improving it is welcome.

Answer 1
1 accepted 2015-09-11 19:44:45

Answer 2
0 2015-09-11 03:26:36
