Sliding standard deviation on a 1D NumPy array
Suppose you have an array and want to create another array whose values are the standard deviations of successive 10-element windows of the first array. With a for loop, this is easy to write, as in the code below. What I want is to avoid the for loop to get a faster execution time. Any suggestions?
Code
import numpy as np

a = np.arange(20)
b = np.empty(11)  # 20 - 10 + 1 windows
for i in range(11):
    b[i] = np.std(a[i:i+10])
You could create a 2D array of sliding windows with np.lib.stride_tricks.as_strided. The windows are views into the given 1D array, so they don't occupy any additional memory. Then simply use np.std along the second axis (axis=1) to get the final result in a vectorized way, like so -
W = 10 # Window size
nrows = a.size - W + 1
n = a.strides[0]
a2D = np.lib.stride_tricks.as_strided(a,shape=(nrows,W),strides=(n,n))
out = np.std(a2D, axis=1)
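As a hedged aside, the same windowed view can be sketched with np.lib.stride_tricks.sliding_window_view. This function is not part of the answer above, and the sketch assumes NumPy 1.20 or later is installed. It returns a read-only view and works out the shape and strides itself, which avoids the out-of-bounds reads that a wrong shape or strides argument to as_strided can cause.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

a = np.arange(20)
W = 10  # window size

# Read-only view of shape (a.size - W + 1, W); no data is copied.
windows = sliding_window_view(a, W)
assert np.shares_memory(a, windows)

# Population standard deviation (ddof=0) of each window.
out = windows.std(axis=1)

# Cross-check against an explicit Python-level loop over the same windows.
expected = np.array([a[i:i + W].std() for i in range(a.size - W + 1)])
assert np.allclose(out, expected)
```

The view itself is free, but np.std still has to materialise temporaries of the full (nrows, W) shape while computing, so peak memory grows with the window size.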
Runtime test
Function definitions -
def original_app(a, W):
    b = np.empty(a.size-W+1)
    for i in range(b.size):
        b[i] = np.std(a[i:i+W])
    return b

def vectorized_app(a, W):
    nrows = a.size - W + 1
    n = a.strides[0]
    a2D = np.lib.stride_tricks.as_strided(a,shape=(nrows,W),strides=(n,n))
    return np.std(a2D,1)
Timings and verification -
In [460]: # Inputs
...: a = np.arange(10000)
...: W = 10
...:
In [461]: np.allclose(original_app(a, W), vectorized_app(a, W))
Out[461]: True
In [462]: %timeit original_app(a, W)
1 loops, best of 3: 522 ms per loop
In [463]: %timeit vectorized_app(a, W)
1000 loops, best of 3: 1.33 ms per loop
So, that's a speedup of around 400x!
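For readers without IPython, the comparison can be reproduced roughly with the standard timeit module. This is only a sketch: compare_timings is a hypothetical helper name introduced here, and the absolute numbers and the ratio will depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

def original_app(a, W):
    # Loop-based version: one np.std call per window.
    b = np.empty(a.size - W + 1)
    for i in range(b.size):
        b[i] = np.std(a[i:i + W])
    return b

def vectorized_app(a, W):
    # Strided 2D view of all windows, reduced along axis 1.
    nrows = a.size - W + 1
    n = a.strides[0]
    a2D = np.lib.stride_tricks.as_strided(a, shape=(nrows, W), strides=(n, n))
    return np.std(a2D, axis=1)

def compare_timings(n=10000, W=10, repeats=3):
    # Hypothetical helper: returns the best wall-clock time in seconds
    # for (loop version, vectorized version).
    a = np.arange(n)
    assert np.allclose(original_app(a, W), vectorized_app(a, W))
    t_loop = min(timeit.repeat(lambda: original_app(a, W), number=1, repeat=repeats))
    t_vec = min(timeit.repeat(lambda: vectorized_app(a, W), number=1, repeat=repeats))
    return t_loop, t_vec

if __name__ == "__main__":
    t_loop, t_vec = compare_timings()
    print(f"loop: {t_loop * 1e3:.2f} ms, vectorized: {t_vec * 1e3:.2f} ms")
```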
For completeness, here's the equivalent pandas version -
import pandas as pd

def pdroll(a, W): # a is 1D ndarray and W is window-size
    return pd.Series(a).rolling(W).std(ddof=0).values[W-1:]
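The two details in the pandas one-liner, ddof=0 and the [W-1:] slice, can be checked with a small sketch like the one below. It assumes pandas is installed. The variable names are illustrative only, and to_numpy() is used here in place of .values.

```python
import numpy as np
import pandas as pd

a = np.arange(20)
W = 10  # window size

rolled = pd.Series(a).rolling(W).std(ddof=0)

# The first W-1 entries are NaN because those windows are incomplete.
assert rolled.iloc[:W - 1].isna().all()

# Dropping them leaves one value per full window, matching np.std (ddof=0).
trimmed = rolled.to_numpy()[W - 1:]
expected = np.array([np.std(a[i:i + W]) for i in range(a.size - W + 1)])
assert np.allclose(trimmed, expected)

# pandas defaults to ddof=1 (sample std), which differs from NumPy's default.
sample = pd.Series(a).rolling(W).std().to_numpy()[W - 1:]
assert not np.allclose(sample, expected)
```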
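Not so fancy, but the code without an explicit for statement would be something like this (a list comprehension still iterates in Python, so it is more compact rather than vectorized):
a = np.arange(20)
# +1 so that the last full window, a[10:20], is included (11 windows in total)
b = [a[i:i+10].std() for i in range(len(a)-10+1)]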