Handling np.NaN When Calculating row-wise Moving Average of a 2D Numpy Array
I'm trying to obtain an array containing the moving averages along the rows of a 2-dimensional numpy array, based on a certain 'window' (i.e. the number of rows included in the average) and an 'offset'. I've come up with the code below, which I know is not efficient:
import numpy as np

def f(array, window, offset):
    # Start from an all-NaN result; rows without a full window stay NaN.
    x = np.full(array.shape, np.nan)
    for row_num in range(array.shape[0]):
        first_row = row_num - window - offset
        last_row = row_num - offset + 1
        if first_row >= 0:
            # Average the selected rows column by column, ignoring NaN entries.
            x[row_num] = np.nanmean(array[first_row:last_row], axis=0)
    return x
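To make the indexing in the loop concrete, here is a minimal sketch that lists which input rows feed a given output row. The helper name `rows_in_window` is hypothetical and only mirrors the `first_row` / `last_row` arithmetic above; it is not part of the original code.

```python
def rows_in_window(row_num, window, offset):
    """Return the input row indices averaged for output row `row_num`.

    Hypothetical helper that mirrors the slice array[first_row:last_row].
    """
    first_row = row_num - window - offset
    last_row = row_num - offset + 1      # exclusive slice end
    if first_row < 0:
        return []                        # not enough history: output row stays NaN
    return list(range(first_row, last_row))

print(rows_in_window(7, 5, 2))   # [0, 1, 2, 3, 4, 5]
print(rows_in_window(6, 5, 2))   # []
```

With `window=5` the slice covers six rows, so the effective window size in this code is `window + 1`.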
I've found a potential solution here, adapted below for my code:
import math
from scipy.ndimage import uniform_filter

def g(array, window, offset):
    return uniform_filter(array, size=(window+1, 1), mode='nearest',
                          origin=(math.ceil((window+1)/2 - 1), 0))
This solution, however, has 3 problems:
Is there an efficient way to achieve what I'm trying to get?
As suggested by Ehsan, I've implemented the code below (with a small modification), which behaves like my original code for any offset above 0:
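from skimage.util import view_as_windows

def h(array, window, offset):
    n_cols = array.shape[-1]
    # NaN rows for the positions that do not have a full window yet.
    nan_rows = [[np.nan] * n_cols] * (window + offset)
    means = np.vstack(np.nanmean(view_as_windows(array, (window + 1, n_cols)), -2))
    # means[:-offset] is empty when offset == 0, so this only works for offset > 0.
    return np.vstack((nan_rows, means[:-offset]))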
I'm just not sure how to make it work for any offset (in particular, offset=0). Also, this solution seems to take more time than the original one:
a = np.arange(10*11).reshape(10,11)
%timeit f(a, 5, 2)
%timeit h(a, 5, 2)
>>> 36.6 µs ± 709 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> 67.5 µs ± 2.34 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
I was wondering if there's any alternative that is less time-consuming.
This will give you the same output as your code, but I think you might want to rethink the extra +1 in the last_row definition, since the slice end is exclusive and your actual window size would be window+1:
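import numpy as np
from skimage.util import view_as_windows

def f(array, window, offset):
    n_rows, n_cols = array.shape
    # NaN rows for the positions that do not have a full window yet.
    nan_rows = [[np.nan] * n_cols] * (window + offset)
    means = np.vstack(np.nanmean(view_as_windows(array, (window + 1, n_cols)), -2))
    # Slicing with an explicit length also works when offset == 0.
    return np.vstack((nan_rows, means[:n_rows - window - offset]))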
Sample output:
a = np.arange(7*6).reshape(7,6)
f(a, 2, 1)
#[[nan nan nan nan nan nan]
# [nan nan nan nan nan nan]
# [nan nan nan nan nan nan]
# [ 6. 7. 8. 9. 10. 11.]
# [12. 13. 14. 15. 16. 17.]
# [18. 19. 20. 21. 22. 23.]
# [24. 25. 26. 27. 28. 29.]]
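The same windowing idea can be sketched with NumPy alone, assuming NumPy 1.20 or later, where `numpy.lib.stride_tricks.sliding_window_view` is available. This swaps skimage's `view_as_windows` for the NumPy function; the name `moving_nanmean` is hypothetical, and the sketch assumes a 2D input and `offset >= 0` (including `offset=0`).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def moving_nanmean(array, window, offset):
    """Row-wise moving nanmean over window+1 rows, shifted back by `offset` rows.

    Hypothetical helper; assumes a 2D array and offset >= 0.
    """
    n_rows, n_cols = array.shape
    out = np.full((n_rows, n_cols), np.nan)
    start = window + offset              # first output row that has a full window
    if start >= n_rows:
        return out                       # no row has enough history
    # Shape (n_rows - window, n_cols, window + 1): the window axis comes last.
    windows = sliding_window_view(array, window + 1, axis=0)
    means = np.nanmean(windows, axis=-1)
    # Output row r uses the window that starts at input row r - window - offset.
    out[start:] = means[: n_rows - start]
    return out

a = np.arange(7 * 6).reshape(7, 6)
print(moving_nanmean(a, 2, 1))   # same layout as the sample output above
print(moving_nanmean(a, 2, 0))   # offset=0 is handled by the explicit slice length
```

Writing into a preallocated NaN array avoids the `[:-offset]` slice entirely, which is why `offset=0` needs no special case here.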
Comparison using benchit:
# @OP's solution
def f1(array, window, offset):
    x = np.full(array.shape, np.nan)
    for row_num in range(array.shape[0]):
        first_row = row_num - window - offset
        last_row = row_num - offset + 1
        if first_row >= 0:
            x[row_num] = np.nanmean(array[first_row:last_row], axis=0)
    return x

# @Ehsan's solution
def f2(array, window, offset):
    n_rows, n_cols = array.shape
    nan_rows = [[np.nan] * n_cols] * (window + offset)
    means = np.vstack(np.nanmean(view_as_windows(array, (window + 1, n_cols)), -2))
    return np.vstack((nan_rows, means[:n_rows - window - offset]))

# Benchmark inputs: arrays with n rows and 10 columns, window=2, offset=2.
in_ = {n: [np.arange(n*10).reshape(n, 10), 2, 2] for n in [10, 100, 500, 1000, 4000]}
The proposed solution f2 is significantly faster. Note that most vectorized solutions only pay off on larger arrays.
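For long arrays, another vectorized option is to avoid materialising windows at all and use cumulative sums instead. The following is a rough sketch of that different technique, not the method from the answer: it relies on `np.nancumsum` plus a running count of non-NaN entries, the name `moving_nanmean_cumsum` is hypothetical, and it assumes a 2D input, `offset >= 0`, and no infinite values. Cumulative sums can also lose some floating-point precision on very long or very large-valued inputs.

```python
import numpy as np

def moving_nanmean_cumsum(array, window, offset):
    """Moving nanmean over window+1 rows via cumulative sums (hypothetical helper)."""
    array = np.asarray(array, dtype=float)
    n_rows, n_cols = array.shape
    out = np.full((n_rows, n_cols), np.nan)
    start = window + offset              # first output row that has a full window
    if start >= n_rows:
        return out
    valid = ~np.isnan(array)
    # Prepend a zero row so that csum[j] - csum[i] is the sum of rows i..j-1.
    zero = np.zeros((1, n_cols))
    csum = np.vstack((zero, np.nancumsum(array, axis=0)))
    ccnt = np.vstack((zero, np.cumsum(valid, axis=0)))
    width = window + 1
    k = n_rows - start                   # number of output rows with a full window
    sums = csum[width:width + k] - csum[:k]
    counts = ccnt[width:width + k] - ccnt[:k]
    # 0 / 0 yields NaN where a window holds only NaN values; silence that warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        out[start:] = sums / counts
    return out
```

The cost is linear in the number of rows regardless of the window size, which is the main reason to consider it; whether it is actually faster for a given input would need to be measured.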