
Multiple Element Indexing in multi-dimensional array

I have a 3D NumPy array and would like to take the mean over one axis, using only certain elements from the other two dimensions.

This example code shows my problem:

import numpy as np
myarray = np.random.random((5,10,30))
yy = [1,2,3,4]
xx = [20,21,22,23,24,25,26,27,28,29]
mymean = [ np.mean(myarray[t,yy,xx]) for t in np.arange(5) ]

However, this results in:

ValueError: shape mismatch: objects cannot be broadcast to a single shape

Why does indexing such as myarray[:,[1,2,3,4],[1,2,3,4]] work, but not my code above?

This is how you fancy-index over more than one dimension:

>>> np.mean(myarray[np.arange(5)[:, None, None], np.array(yy)[:, None], xx],
            axis=(-1, -2))
array([ 0.49482768,  0.53013301,  0.4485054 ,  0.49516017,  0.47034123])

When you use fancy indexing, i.e. a list or array as an index, over more than one dimension, NumPy broadcasts those index arrays to a common shape and then uses them to index the array. For the broadcast to work, you need to add extra dimensions of length 1 at the end of the leading index arrays: here the three index arrays have shapes (5, 1, 1), (4, 1) and (10,), which broadcast to (5, 4, 10). The rules of the game are NumPy's broadcasting rules.
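The broadcast described above can be checked step by step. This is only a minimal sketch that reuses myarray, yy and xx from the question; the names t_idx, y_idx, x_idx, common_shape and selected are introduced here for illustration and are not part of the original answer.

```python
import numpy as np

myarray = np.random.random((5, 10, 30))
yy = [1, 2, 3, 4]
xx = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

# Index arrays with trailing length-1 axes so they broadcast against each other.
t_idx = np.arange(5)[:, None, None]   # shape (5, 1, 1)
y_idx = np.array(yy)[:, None]         # shape (4, 1)
x_idx = np.array(xx)                  # shape (10,)

# The three index arrays broadcast to one common shape.
common_shape = np.broadcast(t_idx, y_idx, x_idx).shape

# Fancy indexing returns an array with that common shape.
selected = myarray[t_idx, y_idx, x_idx]

# Average over the last two axes: one mean per entry of the first axis.
mymean = selected.mean(axis=(-1, -2))
print(common_shape, selected.shape, mymean.shape)
```

Because yy and xx happen to be consecutive here, the result should agree with myarray[:, 1:5, 20:30].mean(axis=(1, 2)).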

Since you use consecutive elements you can use a slice:

import numpy as np
myarray = np.random.random((5,10,30))
yy = slice(1,5)
xx = slice(20, 30)
mymean = [np.mean(myarray[t, yy, xx]) for t in np.arange(5)]

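To answer your question about why it doesn't work: when you use lists or arrays as indices, NumPy applies different indexing semantics than it does for slices. Index lists are broadcast against each other and paired element by element, so two lists of length 4 can be combined, but lists of length 4 and 10 cannot. You can see the full story in the NumPy indexing documentation and, as that page says, it "can be somewhat mind-boggling".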
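To make the difference concrete, here is a small sketch comparing the two kinds of indexing on the array from the question. The names block, paired and raised are introduced for illustration only. The exception type raised for mismatched index lists has varied between NumPy versions, so the sketch catches both IndexError and ValueError.

```python
import numpy as np

myarray = np.random.random((5, 10, 30))

# Slices select a rectangular block: every y in 1..4 combined with every x in 20..29.
block = myarray[0, 1:5, 20:30]                     # shape (4, 10)

# Equal-length lists are paired element by element: (1,1), (2,2), (3,3), (4,4).
paired = myarray[:, [1, 2, 3, 4], [1, 2, 3, 4]]    # shape (5, 4)

# Lists of length 4 and 10 can be neither paired nor broadcast, hence the error.
try:
    myarray[0, [1, 2, 3, 4], list(range(20, 30))]
    raised = False
except (IndexError, ValueError):
    raised = True

print(block.shape, paired.shape, raised)
```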

If you want to do it for nonconsecutive elements, you need to understand that more complex indexing mechanism.
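For the nonconsecutive case, one option is np.ix_, which builds index arrays that are already shaped for broadcasting. This is a sketch rather than the approach either answer prescribes; the index values in yy and xx below are hypothetical and were chosen only to be nonconsecutive.

```python
import numpy as np

myarray = np.random.random((5, 10, 30))
yy = [0, 3, 7]          # hypothetical nonconsecutive indices along axis 1
xx = [2, 5, 11, 29]     # hypothetical nonconsecutive indices along axis 2

# np.ix_ returns index arrays of shapes (3, 1) and (1, 4), which broadcast to (3, 4).
y_idx, x_idx = np.ix_(yy, xx)

# The leading slice keeps all 5 entries of the first axis.
selected = myarray[:, y_idx, x_idx]       # shape (5, 3, 4)

# One mean per entry of the first axis, over the selected y/x combinations.
mymean = selected.mean(axis=(1, 2))
print(selected.shape, mymean.shape)
```

This selects every combination of a y from yy with an x from xx, which is the same "rectangular block" behaviour that slices give for consecutive indices.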
