
Get a vector from a matrix and a vector of indices in NumPy

I have a matrix m = [[1,2,3],[4,5,6],[7,8,9]] and a vector v=[1,2,0] that contains the indices of the rows I want to return for each column of my matrix.

The result I expect is r=[4,8,3], but I cannot find a straightforward way to get it using numpy.

By using the vector as the row index and listing the columns explicitly, I get this: m[v,[0,1,2]] = [4, 8, 3], which is roughly what I am looking for.

To avoid hardcoding the columns, I'm using np.arange(m.shape[1]), so my final formula looks like r=m[v,np.arange(m.shape[1])]

This looks odd to me, and a little complicated for something that should be quite common.

Is there a cleaner way to get this result?

In [157]: m = np.array([[1,2,3],[4,5,6],[7,8,9]]);v=np.array([1,2,0])
In [158]: m
Out[158]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
In [159]: v
Out[159]: array([1, 2, 0])
In [160]: m[v,np.arange(3)]
Out[160]: array([4, 8, 3])

We are choosing 3 elements, with indices (1,0),(2,1),(0,2).
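To make that pairing concrete, here is a minimal sketch that writes the same selection as an explicit loop over (row, column) pairs. The names cols and r_loop are only illustrative and do not come from the session above.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v = np.array([1, 2, 0])
cols = np.arange(m.shape[1])  # column indices 0, 1, 2

# Advanced indexing pairs v[k] with cols[k], one pair per output element.
r = m[v, cols]

# The same selection spelled out as a loop over (row, column) pairs.
r_loop = np.array([m[i, j] for i, j in zip(v, cols)])

print(r)       # [4 8 3]
print(r_loop)  # [4 8 3]
```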

Closer to the MATLAB approach:

In [162]: np.ravel_multi_index((v,np.arange(3)),(3,3))
Out[162]: array([3, 7, 2])
In [163]: m.flat[_]
Out[163]: array([4, 8, 3])

Octave/MATLAB equivalent

>> m = [1 2 3;4 5 6;7 8 9];
>> v = [2 3 1]
v =

   2   3   1

>> m = [1 2 3;4 5 6;7 8 9];
>> v = [2 3 1];
>> sub2ind([3,3],v,[1 2 3])
ans =

   2   6   7

>> m(sub2ind([3,3],v,[1 2 3]))
ans =

   4   8   3
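The flat indices differ between the two sessions ([3 7 2] in NumPy, 2 6 7 in Octave) because NumPy defaults to row-major, 0-based indexing while MATLAB/Octave uses column-major, 1-based indexing. The sketch below reproduces both sets of numbers from NumPy; the variable names are illustrative only.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v = np.array([1, 2, 0])
cols = np.arange(m.shape[1])

# Row-major (C order) flat index: row * ncols + col
flat_c = np.ravel_multi_index((v, cols), m.shape)
print(flat_c)          # [3 7 2]
print(m.flat[flat_c])  # [4 8 3]

# Column-major (Fortran order) flat index: row + col * nrows
flat_f = np.ravel_multi_index((v, cols), m.shape, order='F')
print(flat_f + 1)      # [2 6 7], the 1-based values sub2ind reports

# m.flat always iterates in C order, so flatten in F order explicitly.
print(m.ravel(order='F')[flat_f])  # [4 8 3]
```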

The same broadcasting is used to access a block, as illustrated in this recent question:

Is there a way in Python to get a sub matrix as in Matlab?
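As a rough illustration of block access (not taken from the linked question), the sketch below selects rows 0 and 2 and columns 1 and 2. The row index is turned into a column vector so that it broadcasts against the column index; np.ix_ builds the same pair of index arrays.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rows = np.array([0, 2])
cols = np.array([1, 2])

# Shapes (2, 1) and (2,) broadcast to (2, 2): every row paired with every column.
block = m[rows[:, None], cols]
print(block)
# [[2 3]
#  [8 9]]

# np.ix_ constructs the same broadcastable index arrays.
block_ix = m[np.ix_(rows, cols)]

# Without the extra axis the indices are paired element by element instead.
paired = m[rows, cols]
print(paired)  # [2 9]
```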

Well, this 'weird/complicated' thing is actually described as a "straight forward" scenario in the documentation on integer array indexing, which is a sub-topic of the broader topic "Advanced Indexing".

To quote an extract:

When the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straight forward, but different from slicing. Advanced indexes always are broadcast and iterated as one. Note that the result shape is identical to the (broadcast) indexing array shapes.
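A small sketch of what the quoted passage implies: the result takes the shape of the (broadcast) index arrays, not the shape of the source array. The index values here are made up for illustration.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Five (row, column) pairs drawn from a 3x3 source: the result has shape (5,).
rows = np.array([0, 1, 2, 1, 0])
cols = np.array([0, 1, 2, 1, 0])
picked = m[rows, cols]
print(picked)        # [1 5 9 5 1]
print(picked.shape)  # (5,)

# A scalar index is broadcast against the array index.
first_col = m[rows, 0]
print(first_col)     # [1 4 7 4 1]
```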

If it makes it seem any less complicated/weird, you could use range(m.shape[1]) instead of np.arange(m.shape[1]). The index just needs to be an integer array or array-like structure.
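A quick check of that claim, as a sketch: an ndarray, a range object and plain Python lists used as the index all select the same elements.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v = np.array([1, 2, 0])

r_arange = m[v, np.arange(m.shape[1])]  # ndarray as the column index
r_range = m[v, range(m.shape[1])]       # range object as the column index
r_list = m[v.tolist(), [0, 1, 2]]       # plain Python lists for both indices

print(r_arange, r_range, r_list)  # [4 8 3] [4 8 3] [4 8 3]
```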

Visualization / Intuition:

When I was learning this (integer array indexing), it helped me to visualize things in the following way:

I visualized the indexing arrays standing side-by-side, all having exactly the same shape (perhaps as a consequence of getting broadcasted together). I also visualized the result array, which also has the same shape as the indexing arrays. In each of these indexing arrays and the result array, I visualized a monkey, capable of doing a walk-through of its own array, hopping to successive elements of its own array. Note that, in general, this identical shape of the indexing arrays and the result array, can be n-dimensional, and this identical shape can be very different from the shape of the source array whose values are actually being indexed.

In your own example, the source array m has shape (3,3), and the indexing arrays and the result array each have a shape of (3,).

In your example, there is a monkey in each of those three arrays (the two indexing arrays and the result array). We then visualize the monkeys doing a walk-through of their respective array elements in tandem. Here, "in tandem" means all three monkeys start at the first element of their respective arrays, and whenever a monkey hops to the next element of its own array, the other monkeys also hop to the next element in theirs.

As it hops to each successive element, the monkey in each indexing array calls out the value of the element it has just visited. The monkey in the result array hops in tandem with them. It hears the values being called out, uses those values as indices into the source array m, and thus determines the value to be picked from m. It picks up this value from m and stores it in the result array, at the location it has just hopped to. Thus, for example, when all three monkeys are at the second element of their respective arrays, the second position of the result array gets its value: the indexing monkeys call out 2 and 1, so m[2,1] = 8 is stored there.
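The walk-through above can be sketched as an explicit loop. This is only a mental model written out in Python, not how NumPy implements indexing internally; the names rows, cols, result and pos are illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v = np.array([1, 2, 0])

# Broadcast the two index arrays to a common shape, here (3,).
rows, cols = np.broadcast_arrays(v, np.arange(m.shape[1]))

# The result array has the same shape as the broadcast index arrays.
result = np.empty(rows.shape, dtype=m.dtype)

# Visit every position of the common shape "in tandem".
for pos in np.ndindex(rows.shape):
    i = rows[pos]         # value called out by the first indexing array
    j = cols[pos]         # value called out by the second indexing array
    result[pos] = m[i, j]  # picked from the source and stored at this position

print(result)  # [4 8 3]
```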

Judging by the example in the NumPy documentation, I think the approach you mentioned is the standard way to do this task:

Example
From each row, a specific element should be selected. The row index is just [0, 1, 2] and the column index specifies the element to choose for the corresponding row, here [0, 1, 0]. Using both together the task can be solved using advanced indexing:

x = np.array([[1, 2], [3, 4], [5, 6]])
x[[0, 1, 2], [0, 1, 0]]
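Run as a script, the documentation example looks like the sketch below, with the row index generated by np.arange rather than hardcoded. The np.take_along_axis call at the end is an alternative that is not mentioned in this thread; it is included only as a possible option and needs the index array to have the same number of dimensions as x.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])
col_idx = np.array([0, 1, 0])  # column to pick from each row

# One (row, column) pair per row of x.
picked = x[np.arange(x.shape[0]), col_idx]
print(picked)  # [1 4 5]

# Alternative (not from the thread): take_along_axis with a (3, 1) index array.
picked_alt = np.take_along_axis(x, col_idx[:, None], axis=1)[:, 0]
print(picked_alt)  # [1 4 5]
```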

