
Is vectorize too slow in Python? Or is it integrate.quad? Or is it my code?

I am new to the community, so please pardon me if I haven't provided the information as expected.

I am trying to learn Python, coming from MATLAB. I have a very simple piece of code to start with:

from numpy import vectorize
from scipy import integrate
from scipy.special import j1
from math import sqrt, exp, pi, log
import matplotlib.pyplot as plt
import numpy as np



def plot_hc(radius, pd):
    q = np.linspace(0.008, 1.0, num=500)
    y = hc(q, radius, pd)
    plt.loglog(q, y)
    plt.show()

def hc_formfactor(q, radius):
    y = (1.0 / q) * (radius * j1(q * radius))
    y = y ** 2
    return y

def g_distribution(z, radius, pd):
    return (1 / (sqrt(2 * pi) * pd)) * exp(
        -((z - radius) / (sqrt(2) * pd)) ** 2)


def ln_distribution(z, radius, pd):
    return (1 / (sqrt(2 * pi) * pd * z / radius)) * exp(
        -(log(z / radius) / (sqrt(2) * pd)) ** 2)

# Dist=1(for G_Distribution)
# Dist=2(for LN Distribution)
Dist = 1


@vectorize
def hc(x, radius, pd):
    # Integration limits: radius +/- nmpts * pd, with the lower limit clipped at 0.
    nmpts = 4
    va = max(radius - nmpts * pd, 0)
    vb = radius + nmpts * pd

    # Normalisation: integral of the selected distribution over [va, vb].
    if Dist == 1:
        d = integrate.quad(lambda z: g_distribution(z, radius, pd), va, vb)[0]
    elif Dist == 2:
        d = integrate.quad(lambda z: ln_distribution(z, radius, pd), va, vb)[0]
    else:
        d = 1.0

    def fun(z, x, radius, pd):
        # Form factor weighted by the selected size distribution.
        if Dist == 1:
            return hc_formfactor(x, z) * g_distribution(z, radius, pd)
        elif Dist == 2:
            return hc_formfactor(x, z) * ln_distribution(z, radius, pd)
        else:
            return hc_formfactor(x, z)

    y = integrate.quad(lambda z: fun(z, x, radius, pd), va, vb)[0]
    return y / d

if __name__ == '__main__':
    plot_hc(radius=40, pd=0.5)

As some suggested, I tried a plain for loop instead, but that made it even slower. The code using the for loop is as follows:

from numpy import vectorize
from scipy import integrate
from scipy.special import j1
from math import sqrt, exp, pi, log
import matplotlib.pyplot as plt
import numpy as np



def plot_hc(radius, pd):
    q = np.linspace(0.008, 1.0, num=500)
    y = hc(q, radius, pd)
    plt.loglog(q, y)
    plt.show()

def hc_formfactor(q, radius):
    y = (1.0 / q) * (radius * j1(q * radius))
    y = y ** 2
    return y

def g_distribution(z, radius, pd):
    return (1 / (sqrt(2 * pi) * pd)) * exp(
        -((z - radius) / (sqrt(2) * pd)) ** 2)


def ln_distribution(z, radius, pd):
    return (1 / (sqrt(2 * pi) * pd * z / radius)) * exp(
        -(log(z / radius) / (sqrt(2) * pd)) ** 2)

# Dist=1(for G_Distribution)
# Dist=2(for LN Distribution)
Dist = 1


def hc(q, radius, pd):
    # Integration limits: radius +/- nmpts * pd, with the lower limit clipped at 0.
    nmpts = 4
    va = max(radius - nmpts * pd, 0)
    vb = radius + nmpts * pd

    # Normalisation: integral of the selected distribution over [va, vb].
    if Dist == 1:
        d = integrate.quad(lambda z: g_distribution(z, radius, pd), va, vb)[0]
    elif Dist == 2:
        d = integrate.quad(lambda z: ln_distribution(z, radius, pd), va, vb)[0]
    else:
        d = 1.0

    def fun(z, q, radius, pd):
        # Form factor weighted by the selected size distribution.
        if Dist == 1:
            return hc_formfactor(q, z) * g_distribution(z, radius, pd)
        elif Dist == 2:
            return hc_formfactor(q, z) * ln_distribution(z, radius, pd)
        else:
            return hc_formfactor(q, z)

    # One adaptive quadrature per q value.
    y = np.empty(len(q))
    for n in range(len(q)):
        y[n] = integrate.quad(lambda z: fun(z, q[n], radius, pd), va, vb)[0]

    return y / d


if __name__ == '__main__':
    plot_hc(radius=40, pd=0.5)

If I run the same program with the same values in MATLAB, it is very fast. I don't know what mistake I made here. Please help :)

Notes

The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
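To see what "essentially a for loop" means in practice, here is a minimal sketch that counts how often the wrapped Python function gets called. The function `square` and the counter `calls` are hypothetical names used only for illustration; they are not part of the code in the question.

```python
import numpy as np

calls = {"n": 0}  # hypothetical counter, for illustration only


def square(x):
    # Plain Python scalar function; record every invocation.
    calls["n"] += 1
    return x * x


x = np.linspace(0.0, 1.0, 500)

# otypes is passed explicitly so that np.vectorize does not need an
# extra trial call to infer the output type.
vsquare = np.vectorize(square, otypes=[float])
y_vec = vsquare(x)

# Whole-array version: one expression, the loop runs inside NumPy.
y_arr = x * x

print("Python-level calls made through vectorize:", calls["n"])
print("results agree:", np.allclose(y_vec, y_arr))
```

The wrapped function runs once per element at Python speed, so in the question's code every one of the 500 q values still pays for its own `integrate.quad` calls, with or without `vectorize`.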

MATLAB has JIT compilation abilities; NumPy does not (Numba provides some of that). To get the best NumPy speed you need to think in terms of whole-array operations, just like we used to in the old MATLAB days.
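As a rough sketch of what a whole-array formulation could look like for the Gaussian case (`Dist == 1`), the adaptive `integrate.quad` call can be swapped for a fixed-order Gauss-Legendre rule, so that the integrand is evaluated on a (q, z) grid in a single step. This is a different quadrature technique from the one in the question, not a drop-in equivalent. The names `hc_whole_array`, `hc_quad_reference`, and the `order` parameter are assumptions made for this example, and the choice `order=200` is a guess that should be checked against `quad` for the parameter range of interest.

```python
import numpy as np
from scipy import integrate
from scipy.special import j1


def hc_whole_array(q, radius, pd, nmpts=4, order=200):
    """Gaussian-weighted form factor for all q at once (hypothetical helper)."""
    q = np.asarray(q, dtype=float)
    va = max(radius - nmpts * pd, 0.0)
    vb = radius + nmpts * pd

    # Gauss-Legendre nodes and weights on [-1, 1], mapped to [va, vb].
    nodes, weights = np.polynomial.legendre.leggauss(order)
    z = 0.5 * (vb - va) * nodes + 0.5 * (vb + va)
    w = 0.5 * (vb - va) * weights

    # Gaussian size distribution evaluated at every node.
    g = np.exp(-((z - radius) / (np.sqrt(2) * pd)) ** 2) / (np.sqrt(2 * np.pi) * pd)

    # Form factor on a (len(q), order) grid via broadcasting.
    ff = (z * j1(q[:, None] * z) / q[:, None]) ** 2

    # Weighted sums replace the per-q calls to integrate.quad.
    return (ff * g) @ w / (g @ w)


def hc_quad_reference(q, radius, pd, nmpts=4):
    """Scalar reference using integrate.quad, for spot checks only."""
    va = max(radius - nmpts * pd, 0.0)
    vb = radius + nmpts * pd

    def gauss(z):
        return np.exp(-((z - radius) / (np.sqrt(2) * pd)) ** 2) / (np.sqrt(2 * np.pi) * pd)

    num = integrate.quad(lambda z: (z * j1(q * z) / q) ** 2 * gauss(z), va, vb)[0]
    den = integrate.quad(gauss, va, vb)[0]
    return num / den


q = np.linspace(0.008, 1.0, num=500)
y = hc_whole_array(q, radius=40, pd=0.5)
```

Comparing a few values of `hc_whole_array` against `hc_quad_reference` is a reasonable sanity check before relying on the fixed-order rule, since a fixed number of nodes may not be enough for a wider distribution or a larger q range.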
