
Summing blocks of N rows in NumPy array

This probably is a duplicate question but I can't find the exact solution I need.

I am trying to sum every N rows together (N = 4 in this example). For an 8-by-9 matrix I would end up with a 2-by-9 array, i.e. rows 0-3 summed together and rows 4-7 summed together. This is my current solution, but is there a way to do it without a list comprehension, in a more "NumPy" way? As it stands, I end up with a list of two length-9 arrays rather than a single 2-by-9 array.

The input array is not fixed at 8-by-9; it can be 12-by-9 or 28-by-9, but the total number of rows will always be an integer multiple of N (here, 8 rows with N = 4).

>>> import numpy as np
>>> a = np.arange(72).reshape(8,9)
>>> a
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26],
       [27, 28, 29, 30, 31, 32, 33, 34, 35],
       [36, 37, 38, 39, 40, 41, 42, 43, 44],
       [45, 46, 47, 48, 49, 50, 51, 52, 53],
       [54, 55, 56, 57, 58, 59, 60, 61, 62],
       [63, 64, 65, 66, 67, 68, 69, 70, 71]])

>>> b = [a[i:i+4,:] for i in range(0,len(a),4)]

>>> b
[array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26],
       [27, 28, 29, 30, 31, 32, 33, 34, 35]]),
 array([[36, 37, 38, 39, 40, 41, 42, 43, 44],
       [45, 46, 47, 48, 49, 50, 51, 52, 53],
       [54, 55, 56, 57, 58, 59, 60, 61, 62],
       [63, 64, 65, 66, 67, 68, 69, 70, 71]])]

>>> b = [np.sum(a[i:i+4,:],axis=0) for i in range(0,len(a),4)]

>>> b
[array([54, 58, 62, 66, 70, 74, 78, 82, 86]),
 array([198, 202, 206, 210, 214, 218, 222, 226, 230])]
In [120]: a
Out[120]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26],
       [27, 28, 29, 30, 31, 32, 33, 34, 35],
       [36, 37, 38, 39, 40, 41, 42, 43, 44],
       [45, 46, 47, 48, 49, 50, 51, 52, 53],
       [54, 55, 56, 57, 58, 59, 60, 61, 62],
       [63, 64, 65, 66, 67, 68, 69, 70, 71]])

In [121]: a.reshape((2, 4, 9)).sum(axis=1)
Out[121]:
array([[ 54,  58,  62,  66,  70,  74,  78,  82,  86],
       [198, 202, 206, 210, 214, 218, 222, 226, 230]])
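
To see why this reshape groups the right rows, it may help to check the intermediate 3D array directly. NumPy arrays are row-major (C order) by default, so splitting the first axis of length 8 into (2, 4) should put rows 0-3 in block 0 and rows 4-7 in block 1. A minimal sketch, assuming the same `a` as above (the names `blocks` and `result` are only for illustration):

```python
import numpy as np

a = np.arange(72).reshape(8, 9)

# Split the 8 rows into 2 blocks of 4 rows each; the 9 columns are untouched.
blocks = a.reshape(2, 4, 9)

# Block 0 should hold rows 0-3 and block 1 rows 4-7 of the original array.
print(np.array_equal(blocks[0], a[:4]))
print(np.array_equal(blocks[1], a[4:]))

# Summing over axis 1 collapses the 4 rows inside each block.
result = blocks.sum(axis=1)
print(result.shape)
```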

Reshape to split the first axis into two, such that the latter of the two new axes has length equal to the window length (4). This gives a 3D array, which we then sum along that new axis, like so -

a.reshape(-1,4,a.shape[-1]).sum(1)

It works for arrays with any number of rows: the -1 in the reshape tells NumPy to compute the length of the first axis of the reshaped array on its own, giving us a generic solution.
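
The one-liner could be wrapped in a small helper for reuse. This is only a sketch: the function name `sum_row_blocks` and the explicit divisibility check are assumptions added for illustration, not part of the answer. The check is there mainly for a clearer error message, since `reshape` itself would already raise a `ValueError` when the row count is not a multiple of `n`.

```python
import numpy as np

def sum_row_blocks(a, n):
    """Sum every n consecutive rows of a 2D array (hypothetical helper)."""
    a = np.asarray(a)
    if a.shape[0] % n != 0:
        raise ValueError("number of rows must be a multiple of n")
    # -1 lets NumPy infer the number of blocks from the row count.
    return a.reshape(-1, n, a.shape[-1]).sum(axis=1)

# 12 rows with n = 4 gives 3 blocks.
out = sum_row_blocks(np.arange(12 * 9).reshape(12, 9), 4)
print(out.shape)
```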

A sample run to make things clear -

# Input array with 8 rows
In [15]: a = np.arange(72).reshape(8,9)

# Get output shape
In [16]: a.reshape(-1,4,a.shape[-1]).sum(1).shape
Out[16]: (2, 9)

# Input array with 28 rows
In [17]: a = np.arange(28*9).reshape(28,9)

# Get output shape
In [18]: a.reshape(-1,4,a.shape[-1]).sum(1).shape
Out[18]: (7, 9)
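
As a different technique from the reshape approach, `np.add.reduceat` can sum between explicit start indices along an axis. It is sketched here only for comparison; the variable names `starts` and `alt` are illustrative. Unlike the reshape, it does not require the row count to be a multiple of the block size, in which case the last block is simply shorter.

```python
import numpy as np

a = np.arange(72).reshape(8, 9)
n = 4

# Start index of each block of n rows: 0, 4 for an 8-row array.
starts = np.arange(0, a.shape[0], n)

# Sum rows from each start index up to the next one (or the end of the array).
alt = np.add.reduceat(a, starts, axis=0)

# For this input, the result should match the reshape-based version.
print(np.array_equal(alt, a.reshape(-1, n, a.shape[-1]).sum(1)))
```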
