
Building a tuple containing colons to be used to index a numpy array

I've created a class for dealing with multidimensional data of a specific type. This class has three attributes: a list containing the names of the axes (self.axisNames); a dictionary containing the parameter values along each axis (self.axes, keyed by the entries in axisNames); and a numpy array containing the data, with one dimension per axis (self.intensityArr).

The class also has functions that dynamically add new axes depending on what I need for a specific case, which makes indexing intensityArr a tricky proposition. To make indexing easier, I've started writing a function to build the index I need:

Inside class:

def indexIntensityArr(self,indexSpec):
    # indexSpec is a dictionary containing axisName:indexVal entries (indexVal is an int)
    # I want the function to return a tuple for use in indexing (see below def)
    indexList = []
    for axis in self.axisNames:
        if axis in indexSpec:
            indexList.append(indexSpec[axis])
        else:
            # <do something to add : to index list (tuple)>
            pass  # placeholder so the block parses
    return tuple(indexList)

Outside class:

# ... create an instance of my class called myBlob with 4 dimensions ...

mySpec = {'axis1':10,'axis3':7}
mySlicedArr = myBlob.intensityArr[myBlob.indexIntensityArr(mySpec)]

I expect the above to result in mySlicedArr being a 2-dimensional array.

What do I need to put in the 'else' clause to get an : (or equivalent) in the tuple I use to index the intensityArr? Is this perhaps a bad way to solve the problem?

Inside indexing brackets [], a : is translated to a slice object, and the whole index is passed to __getitem__ as a tuple. So the equivalent of a bare : is slice(None):

indexList = []
for axis in self.axisNames:
    if axis in indexSpec:
        indexList.append(indexSpec[axis])
    else:
        # slice(None) selects everything along this axis, like a bare ':'
        indexList.append(slice(None))
return tuple(indexList)

Several numpy functions use an indexing trick like this: they build up a tuple of index values and slices. If they need to vary the index, they start with a list, which can be mutated, and convert it to a tuple right before use (e.g. np.apply_along_axis).
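Put together, the approach might look like the following minimal sketch. The class name Blob, its constructor, and the array shape are assumptions made up for illustration; only axisNames, intensityArr and indexIntensityArr come from the question.

```python
import numpy as np


class Blob:
    """Hypothetical stand-in for the asker's class (only the parts needed here)."""

    def __init__(self, axisNames, intensityArr):
        self.axisNames = list(axisNames)
        self.intensityArr = intensityArr

    def indexIntensityArr(self, indexSpec):
        # Build a mutable list first, then freeze it into a tuple for indexing.
        indexList = []
        for axis in self.axisNames:
            # Axes missing from indexSpec get slice(None), i.e. a bare ':'.
            indexList.append(indexSpec.get(axis, slice(None)))
        return tuple(indexList)


# Assumed 4-dimensional data; the shape is arbitrary.
names = ['axis1', 'axis2', 'axis3', 'axis4']
myBlob = Blob(names, np.arange(12 * 3 * 8 * 5).reshape(12, 3, 8, 5))

mySpec = {'axis1': 10, 'axis3': 7}
mySlicedArr = myBlob.intensityArr[myBlob.indexIntensityArr(mySpec)]
print(mySlicedArr.shape)  # (3, 5)
```

Two integer indices remove two of the four dimensions, so the result is 2-dimensional, as the question expects.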

The full spec for slice is slice(start, stop, step), with start and step optional, the same as for range or np.arange. None is equivalent to the unspecified values in a : expression, so slice(None) means the same as a bare :.
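A quick way to check the correspondence between : expressions and slice objects is shown below. The array and variable names are arbitrary and chosen only for this illustration.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

full = slice(None)  # same meaning as a bare ':'
print(full)         # slice(None, None, None)

# Indexing with an explicit tuple of slices matches the ':' notation.
same_a = arr[:, 1, :]
same_b = arr[(full, 1, full)]
print(np.array_equal(same_a, same_b))  # True

# A single argument is taken as stop, just like range(5).
print(slice(5) == slice(None, 5, None))  # True

# '::2' corresponds to slice(None, None, 2).
every_other = slice(None, None, 2)
print(np.array_equal(arr[:, :, ::2], arr[:, :, every_other]))  # True
```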

A small helper class in numpy's index_tricks module (np.lib.index_tricks), exposed as np.s_, translates the : notation into slice objects:

In [61]: np.s_[:,:1,0:,::3]
Out[61]: 
(slice(None, None, None),
 slice(None, 1, None),
 slice(0, None, None),
 slice(None, None, 3))

To add to hpaulj's answer, you can easily make your setup more generic by using np.s_. The advantage over writing slice(...) by hand is that you can use numpy's slice syntax directly, which is easier to read. For example:

mySpec = {'axis1': np.s_[10:15], 'axis3': np.s_[7:8]}
mySlicedArr = myBlob.intensityArr[myBlob.indexIntensityArr(mySpec)]

(Extra info: np.s_[7:8] retrieves only index 7 along that axis, but it preserves the dimension, i.e. your sliced array will still be 4D with a length of 1 along that axis, which is very useful for broadcasting.)
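The difference between an integer index and a length-1 slice can be seen in a small sketch like this one. The array shape is an assumption chosen only for illustration.

```python
import numpy as np

arr = np.arange(12 * 3 * 8 * 5, dtype=float).reshape(12, 3, 8, 5)

# An integer index removes the axis entirely.
dropped = arr[:, :, 7, :]
print(dropped.shape)  # (12, 3, 5)

# A length-1 slice keeps the axis with size 1.
kept = arr[:, :, np.s_[7:8], :]
print(kept.shape)  # (12, 3, 1, 5)

# Because the axis is kept, the result broadcasts against the full array.
relative = arr - kept
print(relative.shape)  # (12, 3, 8, 5)

# The spec from the answer: two slices, so all four dimensions survive.
index = (np.s_[10:15], slice(None), np.s_[7:8], slice(None))
print(arr[index].shape)  # (2, 3, 1, 5)
```

Note that the first axis here has only 12 entries, so 10:15 is clipped to indices 10 and 11; slices past the end of an axis are truncated rather than raising an error.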

And if you want to use the same syntax in your function definition as well:

indexList = []
for axis in self.axisNames:
    if axis in indexSpec:
        indexList.append(indexSpec[axis])
    else:
        indexList.append(np.s_[:])
return tuple(indexList)

All of this can be done equally well with slice. You would specify np.s_[10:15] as slice(10, 15), and np.s_[:] as slice(None), as hpaulj says.
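The equivalence can be verified directly, since np.s_ simply returns the slice objects that the bracket notation produces. A short check, with variable names chosen only for this illustration:

```python
import numpy as np

# np.s_ with a single slice expression returns a plain slice object.
a = np.s_[10:15]
print(type(a).__name__)      # slice
print(a == slice(10, 15))    # True
print(np.s_[:] == slice(None))  # True

# With several comma-separated entries it returns a tuple, ready for indexing.
b = np.s_[10:15, :, 7]
print(b == (slice(10, 15), slice(None), 7))  # True
```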

