
numpy vectorized resampling like pandas DataFrame resample

I have a (4, 2000) numpy array and want to resample each of its 4 rows (N=4) in blocks of 5 elements, using aggregations such as max, min, left, and right, so that the result has shape (4, 400).

I can do this with a Pandas DataFrame using .resample('5Min').agg(~), or with a numpy array and a for loop such as result = [max(input[i:i+5]) for i in range(0, len(input), 5)]. However, the loop takes a long time on a large input array because it is not vectorized. Is there a way to do this with vectorized computation on a numpy array?

Here is another way that uses numpy strides under the hood (a is your array):

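from skimage.util import view_as_blocks
# Split a into non-overlapping (4, 5) blocks; for a (4, 2000) input the result has shape (1, 400, 4, 5)
a = view_as_blocks(a, (4,5))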

Now you can use array methods or slicing to get the aggregates you want:

#max
a.max(-1)[0].T
#min
a.min(-1)[0].T
#left
a[...,0][0].T
#right
a[...,-1][0].T
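
If scikit-image is not available, the same block aggregation can probably be done with a plain numpy reshape. This is a minimal sketch and not part of the original answer; the helper name resample_blocks is hypothetical, and it assumes the number of columns is an exact multiple of the block size.

```python
import numpy as np

def resample_blocks(a, n):
    """Hypothetical helper: aggregate each row of `a` over consecutive blocks of n elements."""
    rows, cols = a.shape
    if cols % n != 0:
        raise ValueError("number of columns must be a multiple of n")
    # Shape (rows, cols // n, n): the last axis holds the n elements of one block.
    blocks = a.reshape(rows, cols // n, n)
    return {
        "max": blocks.max(axis=-1),
        "min": blocks.min(axis=-1),
        "left": blocks[..., 0],    # first element of each block
        "right": blocks[..., -1],  # last element of each block
    }

a = np.arange(40).reshape(4, 10)
out = resample_blocks(a, 5)
print(out["max"])
```

Each entry of out has shape (4, 2) here, so no transpose is needed; a (4, 2000) input would give (4, 400).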

Example:

a
#[[ 0  1  2  3  4  5  6  7  8  9]
# [10 11 12 13 14 15 16 17 18 19]
# [20 21 22 23 24 25 26 27 28 29]
# [30 31 32 33 34 35 36 37 38 39]]

Output for max:
#[[ 4  9]
# [14 19]
# [24 29]
# [34 39]]
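
Note that view_as_blocks needs the array shape to be an exact multiple of the block shape, whereas the loop in the question also accepts a shorter final block. If that case matters, one option is ufunc.reduceat. The sketch below is an assumption about what is wanted, not something from the original answer, and the variable names are illustrative.

```python
import numpy as np

a = np.arange(48).reshape(4, 12)  # 12 columns: not a multiple of 5
n = 5

starts = np.arange(0, a.shape[1], n)           # first index of each block
ends = np.minimum(starts + n, a.shape[1]) - 1  # last index of each block

# reduceat reduces a[:, starts[i]:starts[i+1]]; the final block runs to the end of the row.
block_max = np.maximum.reduceat(a, starts, axis=1)
block_min = np.minimum.reduceat(a, starts, axis=1)
block_left = a[:, starts]
block_right = a[:, ends]

print(block_max.shape)
```

Here the last block holds only two elements, and the results have one column per block, including the partial one.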
