
Slicing 2D arrays using indices from arrays in python

I'm working with slices of a 2D numpy array. To select the slices, I have the indices stored in arrays. For example, I have:

mat = np.zeros([xdim,ydim], float)
xmin = np.array([...]) # Array of minimum indices in x
xmax = np.array([...]) # Array of maximum indices in x
ymin = np.array([...]) # Array of minimum indices in y
ymax = np.array([...]) # Array of maximum indices in y
value = np.array([...]) # Values

Here ... just denotes some integer values calculated previously. All arrays are well-defined and have lengths of ~265,000. What I want to do is something like:

mat[xmin:xmax, ymin:ymax] += value

That is, for the first elements I would have:

mat[xmin[0]:xmax[0], ymin[0]:ymax[0]] += value[0]
mat[xmin[1]:xmax[1], ymin[1]:ymax[1]] += value[1]

and so on, for the ~265,000 elements of the arrays. Unfortunately, what I wrote above does not work; it throws IndexError: invalid slice.
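Spelled out as a plain loop, what I am after is equivalent to:

for i in range(len(value)):
    mat[xmin[i]:xmax[i], ymin[i]:ymax[i]] += value[i]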

I've tried using np.meshgrid as suggested in NumPy: use 2D index array from argmin in a 3D slice, but it hasn't worked for me so far. In any case, I'm looking for a Pythonic way to do this, avoiding for loops.

Any help will be much appreciated!

Thanks!

I don't think there is a satisfactory way of vectorizing your problem without resorting to Cython or the like. Let me outline what a pure numpy solution could look like, which should make clear why this is probably not a very good approach.

First, let's look at a 1D case. There's not much you can do with a bunch of slices in numpy, so the first task is to expand them into individual indices. Say that your arrays were:

import numpy as np

mat = np.zeros((10,))
x_min = np.array([2, 5, 3, 1])
x_max = np.array([5, 9, 8, 7])
value = np.array([0.2, 0.6, 0.1, 0.9])

Then the following code expands the slice limits into lists of (possibly repeating) indices and values, joins them together with bincount, and adds them to the original mat:

x_len = x_max - x_min                                     # length of each slice
x_cum_len = np.cumsum(x_len)                              # cumulative lengths; last entry = total number of expanded indices
x_idx = np.arange(x_cum_len[-1])                          # a single running counter over all expanded positions
x_idx[x_len[0]:] -= np.repeat(x_cum_len[:-1], x_len[1:])  # restart the counter at 0 at the start of every slice
x_idx += np.repeat(x_min, x_len)                          # shift each counter by its slice's starting index
x_val = np.repeat(value, x_len)                           # repeat each value once per index it covers
x_cumval = np.bincount(x_idx, weights=x_val)              # sum the values that land on each index
mat[:len(x_cumval)] += x_cumval                           # add the accumulated sums back into mat

>>> mat
array([ 0. ,  0.9,  1.1,  1.2,  1.2,  1.6,  1.6,  0.7,  0.6,  0. ])
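
To make the expansion concrete, these are the intermediate index and value arrays for this example (printed values; the exact spacing depends on your numpy version):

print(x_idx)  # [2 3 4 5 6 7 8 3 4 5 6 7 1 2 3 4 5 6] -- the ranges 2:5, 5:9, 3:8 and 1:7, back to back
print(x_val)  # [0.2 0.2 0.2 0.6 0.6 0.6 0.6 0.1 0.1 0.1 0.1 0.1 0.9 0.9 0.9 0.9 0.9 0.9]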

It is possible to expand this to your 2D case, although it is anything but trivial, and things start getting hard to follow:

mat = np.zeros((10, 10))
x_min = np.array([2, 5, 3, 1])
x_max = np.array([5, 9, 8, 7])
y_min = np.array([1, 7, 2, 6])
y_max = np.array([6, 8, 6, 9])
value = np.array([0.2, 0.6, 0.1, 0.9])

x_len = x_max - x_min                                     # x extent of each rectangle
y_len = y_max - y_min                                     # y extent of each rectangle
total_len = x_len * y_len                                 # number of cells in each rectangle
x_cum_len = np.cumsum(x_len)
x_idx = np.arange(x_cum_len[-1])                          # expand the x slices into row indices, as in the 1D case
x_idx[x_len[0]:] -= np.repeat(x_cum_len[:-1], x_len[1:])
x_idx += np.repeat(x_min, x_len)
x_val = np.repeat(value, x_len)                           # one value per expanded row index
y_min_ = np.repeat(y_min, x_len)                          # one y slice (start and length) per expanded row index
y_len_ = np.repeat(y_len, x_len)
y_cum_len = np.cumsum(y_len_)
y_idx = np.arange(y_cum_len[-1])                          # now expand those y slices into column indices the same way
y_idx[y_len_[0]:] -= np.repeat(y_cum_len[:-1], y_len_[1:])
y_idx += np.repeat(y_min_, y_len_)
x_idx_ = np.repeat(x_idx, y_len_)                         # repeat each row index over its y slice
xy_val = np.repeat(x_val, y_len_)                         # and its value, once per cell
xy_idx = np.ravel_multi_index((x_idx_, y_idx), dims=mat.shape)  # (row, col) pairs -> flat indices
xy_cumval = np.bincount(xy_idx, weights=xy_val)           # sum the values landing on each cell
mat.ravel()[:len(xy_cumval)] += xy_cumval                 # add them into the flattened mat

Which produces:

>>> mat
array([[ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ],
       [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0.9,  0.9,  0.9,  0. ],
       [ 0. ,  0.2,  0.2,  0.2,  0.2,  0.2,  0.9,  0.9,  0.9,  0. ],
       [ 0. ,  0.2,  0.3,  0.3,  0.3,  0.3,  0.9,  0.9,  0.9,  0. ],
       [ 0. ,  0.2,  0.3,  0.3,  0.3,  0.3,  0.9,  0.9,  0.9,  0. ],
       [ 0. ,  0. ,  0.1,  0.1,  0.1,  0.1,  0.9,  1.5,  0.9,  0. ],
       [ 0. ,  0. ,  0.1,  0.1,  0.1,  0.1,  0.9,  1.5,  0.9,  0. ],
       [ 0. ,  0. ,  0.1,  0.1,  0.1,  0.1,  0. ,  0.6,  0. ,  0. ],
       [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0.6,  0. ,  0. ],
       [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ]])
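
As a sanity check, this matches what the plain loop over the rectangles produces for the same example arrays:

check = np.zeros((10, 10))
for i in range(len(value)):
    check[x_min[i]:x_max[i], y_min[i]:y_max[i]] += value[i]
print(np.allclose(mat, check))  # True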

But if you have ~265,000 two-dimensional slices of arbitrary size, the indexing arrays will run into many millions of items very quickly. Having to read and write that much data can cancel out the speed improvements that come with using numpy. Frankly, I doubt this is a good option at all, if nothing else because of how cryptic your code will become.
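
If you do end up resorting to a compiled loop, it doesn't have to be much code. As a minimal sketch, assuming numba is available (Cython would follow the same idea), something like this would do the accumulation with plain nested loops:

from numba import njit

@njit
def add_boxes(mat, x_min, x_max, y_min, y_max, value):
    # straightforward triple loop, compiled to machine code by numba
    for k in range(value.shape[0]):
        for i in range(x_min[k], x_max[k]):
            for j in range(y_min[k], y_max[k]):
                mat[i, j] += value[k]

add_boxes(mat, x_min, x_max, y_min, y_max, value)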
