How to get all possible slices of a 1D numpy array depending on input

I have a numpy array:

a = np.arange(12)
>>> [0,1,2,3,4,5,6,7,8,9,10,11]

I am trying to calculate all possible cumsums like this:

np.cumsum(a[2:]) + np.cumsum(a[:-2])
np.cumsum(a[3:]) + np.cumsum(a[:-3])
...
np.cumsum(a[11:]) + np.cumsum(a[:-11])

How can I achieve this without a loop? I tried:

starts = np.arange(2, 12)
np.cumsum(a[starts:]) + np.cumsum(a[:-starts])
but I get this error:

TypeError: only integer scalar arrays can be converted to a scalar index

How do I do this without a for loop?

What I am trying to do

I am trying to calculate the moving average for every possible window length within a sequence. For example, with an array of size 10 I could compute a moving average over 1 period (which doesn't make sense), 2 periods, 3 periods, and so on up to 10 periods. How do I accomplish this? I want to calculate the moving average for every window from 2 to n, where n is the size of the sequence.

I'm not sure I understood the question completely, but here's something you could use as a starting point.

You need arrays of uniform size to be able to exploit vectorization. Simple slicing cannot give you that, because each slice has a different length, but zero padding can help in this case:

In [3]: a = np.arange(12)

In [4]: a
Out[4]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [15]: starts = np.arange(2,12)

In [18]: left = np.stack([np.pad(a,(0,s),mode="constant")[s:] for s in starts])

In [19]: left
Out[19]: 
array([[ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11,  0,  0],
       [ 3,  4,  5,  6,  7,  8,  9, 10, 11,  0,  0,  0],
       [ 4,  5,  6,  7,  8,  9, 10, 11,  0,  0,  0,  0],
       [ 5,  6,  7,  8,  9, 10, 11,  0,  0,  0,  0,  0],
       [ 6,  7,  8,  9, 10, 11,  0,  0,  0,  0,  0,  0],
       [ 7,  8,  9, 10, 11,  0,  0,  0,  0,  0,  0,  0],
       [ 8,  9, 10, 11,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 9, 10, 11,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [10, 11,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [11,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])

Here you also need to shift everything to the left to get proper alignment:

In [27]: right = np.stack([ np.roll(np.pad(a, (s,0), mode="constant")[:-s], -s) for s in starts ])

In [28]: right
Out[28]: 
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 0, 0],
       [0, 1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 5, 6, 0, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

Now you can use the vectorized np.cumsum for the intensive part:

In [41]: np.cumsum(left, axis=1) + np.cumsum(right, axis=1)
Out[41]:
array([[  2,   6,  12,  20,  30,  42,  56,  72,  90, 110, 110, 110],
       [  3,   8,  15,  24,  35,  48,  63,  80,  99,  99,  99,  99],
       [  4,  10,  18,  28,  40,  54,  70,  88,  88,  88,  88,  88],
       [  5,  12,  21,  32,  45,  60,  77,  77,  77,  77,  77,  77],
       [  6,  14,  24,  36,  50,  66,  66,  66,  66,  66,  66,  66],
       [  7,  16,  27,  40,  55,  55,  55,  55,  55,  55,  55,  55],
       [  8,  18,  30,  44,  44,  44,  44,  44,  44,  44,  44,  44],
       [  9,  20,  33,  33,  33,  33,  33,  33,  33,  33,  33,  33],
       [ 10,  22,  22,  22,  22,  22,  22,  22,  22,  22,  22,  22],
       [ 11,  11,  11,  11,  11,  11,  11,  11,  11,  11,  11,  11]])

You probably need to clean up the result to get what you want. I'm still not sure what that is, so it would be great if you could post the expected output. Something like this should do:

In [49]: csum = np.cumsum(left, axis=1) + np.cumsum(right, axis=1)

In [50]: [row[:-s] for row, s in zip(csum, starts)]
Out[50]: 
[array([  2,   6,  12,  20,  30,  42,  56,  72,  90, 110]),
 array([ 3,  8, 15, 24, 35, 48, 63, 80, 99]),
 array([ 4, 10, 18, 28, 40, 54, 70, 88]),
 array([ 5, 12, 21, 32, 45, 60, 77]),
 array([ 6, 14, 24, 36, 50, 66]),
 array([ 7, 16, 27, 40, 55]),
 array([ 8, 18, 30, 44]),
 array([ 9, 20, 33]),
 array([10, 22]),
 array([11])]
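
The list comprehensions that build left and right above still loop in Python. As a rough sketch, the same two padded matrices can be produced with broadcast index arithmetic instead. The names n, cols, idx, valid and result are introduced here for illustration only, and the final trimming step still returns a ragged list, since rows of different lengths cannot live in one regular array.

```python
import numpy as np

a = np.arange(12)
n = a.size
starts = np.arange(2, n)      # shifts 2 .. n-1, as in the question
cols = np.arange(n)           # column positions 0 .. n-1

# idx[i, j] is the position in `a` that left[i, j] should read from
idx = cols[None, :] + starts[:, None]
valid = idx < n               # False where the shifted index runs off the end

# left[i] is a[starts[i]:] followed by zeros (clip idx so the lookup is safe)
left = np.where(valid, a[np.minimum(idx, n - 1)], 0)

# right[i] is a[:-starts[i]] followed by zeros; the same mask applies
right = np.where(valid, a[None, :], 0)

csum = np.cumsum(left, axis=1) + np.cumsum(right, axis=1)

# Drop the padded tail of each row
result = [row[: n - s] for row, s in zip(csum, starts)]
```

This trades the per-row np.pad calls for a single (len(starts), n) index matrix, so memory use grows quadratically with the length of a.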

This is not exactly what you asked for, but if you are looking for a simpler solution, you can use the pandas approach.

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(11)})  # your data
window_lengths = np.arange(2, len(df))  # window lengths 2 .. n-1; use len(df) + 1 to include n
[rolling_win.mean() for rolling_win in [df.rolling(length) for length in window_lengths]]

Output:

 [      a
     0   NaN
     1   0.5
     2   1.5
     3   2.5
     4   3.5
     5   4.5
     6   5.5
     7   6.5
     8   7.5
     9   8.5
     10  9.5,       a
     0   NaN
     1   NaN
     2   1.0
     3   2.0
     4   3.0
     5   4.0
     6   5.0
     7   6.0
     8   7.0
     9   8.0
     10  9.0,       a
     0   NaN
     1   NaN
     2   NaN
     3   1.5
     4   2.5
     5   3.5
     6   4.5
     7   5.5
     8   6.5
     9   7.5
     10  8.5,       a
     0   NaN
     1   NaN
     2   NaN
     3   NaN
     4   2.0
     5   3.0
     6   4.0
     7   5.0
     8   6.0
     9   7.0
     10  8.0,       a
     0   NaN
     1   NaN
     2   NaN
     3   NaN
     4   NaN
     5   2.5
     6   3.5
     7   4.5
     8   5.5
     9   6.5
     10  7.5,       a
     0   NaN
     1   NaN
     2   NaN
     3   NaN
     4   NaN
     5   NaN
     6   3.0
     7   4.0
     8   5.0
     9   6.0
     10  7.0,       a
     0   NaN
     1   NaN
     2   NaN
     3   NaN
     4   NaN
     5   NaN
     6   NaN
     7   3.5
     8   4.5
     9   5.5
     10  6.5,       a
     0   NaN
     1   NaN
     2   NaN
     3   NaN
     4   NaN
     5   NaN
     6   NaN
     7   NaN
     8   4.0
     9   5.0
     10  6.0,       a
     0   NaN
     1   NaN
     2   NaN
     3   NaN
     4   NaN
     5   NaN
     6   NaN
     7   NaN
     8   NaN
     9   4.5
     10  5.5]
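
For comparison, the non-NaN values in the output above can be reproduced with NumPy alone by differencing a single cumulative sum. This is only a sketch, assuming a plain unweighted moving average is what is wanted; moving_average is a hypothetical helper name, and the loop over window lengths remains, as it does in the pandas version.

```python
import numpy as np

def moving_average(a, w):
    """Mean of every length-w window of a, computed from one cumulative sum."""
    # Prepend 0 so that c[k] is the sum of the first k elements of a
    c = np.concatenate(([0.0], np.cumsum(a, dtype=float)))
    # Sum of window starting at i is c[i + w] - c[i]
    return (c[w:] - c[:-w]) / w

a = np.arange(11)
# One array per window length 2 .. n-1, matching window_lengths above
averages = {w: moving_average(a, w) for w in range(2, a.size)}
```

Each array has n - w + 1 entries, one per complete window, so there is no NaN padding at the start as there is with rolling.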
