Optimizing calculations in Python with numpy and numba

Question

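I'm trying to make a standard deviation function run faster in Python using numba and numpy. The problem is that the for loop is very slow, and I need an alternative that speeds the code up. I applied numba to the existing numpy version, but the performance gain was small. My original list_ contains millions of values, so computing the standard deviation takes a long time. The list_ below is a very short numpy array that serves as an example of my problem, since I can't post the original data. The for loop in the function below computes the standard deviation of every n consecutive numbers in list_, where n is defined by the variable number. How can I make the current function run faster?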

import numpy as np
from numba import njit,jit,vectorize

number = 5
list_= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])

Plain code:

def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    print(std)
std_()

Numba code:

jitted_func = njit()(std_)
jitted_func()

Performance results:

Answer 1

You can do this in a vectorized way.

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    return std

std1 = np.std(rolling_window(list_, 5), axis=1)
print(np.allclose(std1[:-1], std_()))

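This gives True. The slice std1[:-1] is needed because the loop in std_ stops one window short of the end of the array, while rolling_window produces every full window. The code for rolling_window is taken from this answer.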
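As a side note, the same windowed view can be sketched with `np.lib.stride_tricks.sliding_window_view`, assuming NumPy 1.20 or newer is available. It returns a read-only view and checks the window size for you, which avoids the out-of-bounds risks of a hand-written `as_strided` call. The names `values` and `window` below are illustrative and not part of the original post.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Illustrative data; the original post uses a much larger array.
values = np.random.rand(1000)
window = 5

# Shape (len(values) - window + 1, window); no data is copied.
windows = sliding_window_view(values, window)

# Population standard deviation (ddof=0) of every full window.
rolling_std = windows.std(axis=1)
```

Like `rolling_window`, this yields one more value than the loop in `std_`, because it includes the final full window.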

Comparison with numba:

import numpy as np
from numba import njit,jit,vectorize

number = 5
list_= np.random.rand(10000)

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    return std

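# Cell 1: vectorized numpy version
%timeit np.std(rolling_window(list_, 5), axis=1)

# Cell 2: numba version (%%timeit must be the first line of its own cell)
%%timeit
jitted_func = njit()(std_)
jitted_func()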

This gives:

499 µs ± 3.98 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
106 ms ± 2.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
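Note that the numba timing above wraps `std_` with `njit` inside the timed cell, so each run likely includes compilation time. A minimal sketch of a fairer setup, under a few assumptions, is shown below: the function takes the array and window as arguments instead of reading globals, uses explicit loops, and is called once before timing so that compilation is excluded. The name `rolling_std_loop` is hypothetical, and the `try`/`except` fallback exists only so the sketch still runs where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for portability): run as plain Python without numba.
    def njit(func):
        return func


@njit
def rolling_std_loop(values, window):
    # One output per full window; assumes len(values) >= window.
    n_out = values.shape[0] - window + 1
    out = np.empty(n_out)
    for i in range(n_out):
        # Mean of the current window.
        mean = 0.0
        for j in range(window):
            mean += values[i + j]
        mean /= window
        # Population variance (ddof=0), matching ndarray.std().
        var = 0.0
        for j in range(window):
            diff = values[i + j] - mean
            var += diff * diff
        out[i] = np.sqrt(var / window)
    return out


data = np.random.rand(10000)
rolling_std_loop(data, 5)           # warm-up call: triggers compilation
result = rolling_std_loop(data, 5)  # this is the call worth timing
```

In a notebook, `%timeit rolling_std_loop(data, 5)` after the warm-up call would measure only the compiled loop. How it compares with the strided numpy version depends on the machine and array size, so no figures are claimed here.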
