Sliding standard deviation on a 1D NumPy array

Suppose you have an array and want to create a second array whose values are the standard deviations of successive 10-element windows of the first. With a for loop this is easy to write, as in the code below. I would like to avoid the for loop to get a faster execution time. Any suggestions?

Code
import numpy as np

a = np.arange(20)
b = np.empty(11)  # 20 - 10 + 1 windows
for i in range(11):
    b[i] = np.std(a[i:i+10])  # std of the 10-element window starting at i

You could create a 2D array of sliding windows with np.lib.stride_tricks.as_strided. The windows are views into the given 1D array, so the 2D array itself occupies no additional memory. Then simply use np.std along the second axis (axis=1) to get the final result in a vectorized way, like so -

W = 10 # Window size
nrows = a.size - W + 1
n = a.strides[0]
a2D = np.lib.stride_tricks.as_strided(a,shape=(nrows,W),strides=(n,n))
out = np.std(a2D, axis=1)
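as_strided does no bounds checking, so a wrong shape or stride can silently read memory outside the array. As a hedged alternative (not part of the original answer), NumPy 1.20 and later provide sliding_window_view, which builds the same kind of windowed view with the shape computed for you. A minimal sketch, assuming a recent enough NumPy:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

a = np.arange(20)
W = 10  # window size

# Read-only view of shape (a.size - W + 1, W); the input data is not copied.
windows = sliding_window_view(a, W)

# Standard deviation of each window, computed along the window axis.
out = np.std(windows, axis=1)
```

The result should match the as_strided version above; the view is read-only by default, which guards against accidental writes through overlapping windows.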

Runtime test

Function definitions -

def original_app(a, W):
    b = np.empty(a.size-W+1)
    for i in range(b.size):
        b[i] = np.std(a[i:i+W])
    return b
    
def vectorized_app(a, W):
    nrows = a.size - W + 1
    n = a.strides[0]
    a2D = np.lib.stride_tricks.as_strided(a,shape=(nrows,W),strides=(n,n))
    return np.std(a2D,1)

Timings and verification -

In [460]: # Inputs
     ...: a = np.arange(10000)
     ...: W = 10
     ...: 

In [461]: np.allclose(original_app(a, W), vectorized_app(a, W))
Out[461]: True

In [462]: %timeit original_app(a, W)
1 loops, best of 3: 522 ms per loop

In [463]: %timeit vectorized_app(a, W)
1000 loops, best of 3: 1.33 ms per loop

So, around a 400x speedup there!
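These timings come from one machine and one NumPy version, so the exact ratio will vary. To reproduce the comparison outside IPython, something like the following sketch with the standard timeit module could be used. The helper name compare and its parameters are hypothetical, introduced only for illustration; the two functions are copied from the definitions above.

```python
import timeit

import numpy as np


def original_app(a, W):
    # Loop version from the question: one np.std call per window.
    b = np.empty(a.size - W + 1)
    for i in range(b.size):
        b[i] = np.std(a[i:i + W])
    return b


def vectorized_app(a, W):
    # Strided-view version from the answer.
    nrows = a.size - W + 1
    n = a.strides[0]
    a2D = np.lib.stride_tricks.as_strided(a, shape=(nrows, W), strides=(n, n))
    return np.std(a2D, axis=1)


def compare(a, W, repeat=3):
    """Return the best-of-`repeat` time in seconds for (loop, vectorized)."""
    t_loop = min(timeit.repeat(lambda: original_app(a, W), number=1, repeat=repeat))
    t_vec = min(timeit.repeat(lambda: vectorized_app(a, W), number=1, repeat=repeat))
    return t_loop, t_vec


a = np.arange(10000)
t_loop, t_vec = compare(a, 10)
print(f"loop: {t_loop * 1e3:.2f} ms, vectorized: {t_vec * 1e3:.2f} ms")
```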

For completeness, here's the equivalent pandas version -

import pandas as pd

def pdroll(a, W): # a is 1D ndarray and W is window-size
    return pd.Series(a).rolling(W).std(ddof=0).values[W-1:]
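The ddof=0 argument matters here: np.std defaults to ddof=0 (population standard deviation), while the pandas rolling std defaults to ddof=1 (sample standard deviation), so without it the two results would differ. A small NumPy-only sketch of the difference on a single 10-element window:

```python
import numpy as np

x = np.arange(10)

population_std = np.std(x)       # ddof=0, NumPy's default: divide by n
sample_std = np.std(x, ddof=1)   # ddof=1, the pandas rolling default: divide by n - 1

# The two differ by a factor of sqrt(n / (n - 1)).
n = x.size
ratio = sample_std / population_std
```

For short windows this factor is not negligible, which is why the pandas version passes ddof=0 explicitly.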

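Not so fancy, but the code without an explicit for statement would be something like this (a list comprehension still iterates in Python, so expect it to be more compact rather than much faster):

a = np.arange(20)
b = [a[i:i+10].std() for i in range(len(a) - 10 + 1)]  # 11 windows, matching the loop in the question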
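A different technique, not taken from any of the answers above, is to use cumulative sums together with the identity Var(x) = E[x^2] - (E[x])^2. The work is then independent of the window size. The function name cumsum_std is hypothetical, and the approach can lose precision when the values are large relative to their spread, so treat this as a sketch rather than a drop-in replacement:

```python
import numpy as np


def cumsum_std(a, W):
    """Sliding population std (ddof=0) over windows of length W, via cumulative sums."""
    a = np.asarray(a, dtype=np.float64)
    # Prefix sums with a leading zero, so c[i + W] - c[i] is the sum over a[i:i+W].
    c1 = np.concatenate(([0.0], np.cumsum(a)))
    c2 = np.concatenate(([0.0], np.cumsum(a * a)))
    s1 = c1[W:] - c1[:-W]   # window sums
    s2 = c2[W:] - c2[:-W]   # window sums of squares
    var = s2 / W - (s1 / W) ** 2
    # Round-off can make the variance slightly negative; clip before the square root.
    return np.sqrt(np.maximum(var, 0.0))


a = np.arange(20)
out = cumsum_std(a, 10)
```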
