
Having trouble figuring out how to use argsort to create an index on one array and use it to sort another (without flattening)

I have an array called ranks that I would like to sort by column. I would like to use the index of that sort to rearrange an array with the same dimensions called unsorted. I would then like to select the top 2 rows of unsorted.

This is an example of what I have so far that doesn't work:

import numpy as np

ranks = np.random.uniform(0,1,(10,5))
unsorted = np.random.uniform(0,1,(10,5))

ind = np.argsort(ranks,axis = 1)
sorted = unsorted[ind]
sorted = sorted[0:2,:]

Speed is also an issue as it will be applied to large arrays (50,000 x 5,000).

First make sure we know how to apply argsort to ranks:

In [222]: ranks = np.random.randint(0,10,(4,5)) 
     ...: unsorted = np.random.randint(0,10,(4,5)) 
     ...:  
     ...: ind = np.argsort(ranks,axis = 1)                                                           
In [223]: ranks                                                                                      
Out[223]: 
array([[5, 9, 4, 8, 6],
       [8, 6, 7, 3, 1],
       [1, 2, 3, 4, 8],
       [6, 0, 0, 5, 0]])
In [224]: ind                                                                                        
Out[224]: 
array([[2, 0, 4, 3, 1],
       [4, 3, 1, 2, 0],
       [0, 1, 2, 3, 4],
       [1, 2, 4, 3, 0]])
In [225]: np.take_along_axis(ranks, ind, axis=1)                                                     
Out[225]: 
array([[4, 5, 6, 8, 9],
       [1, 3, 6, 7, 8],
       [1, 2, 3, 4, 8],
       [0, 0, 0, 5, 6]])

Here each row is ordered.

The method used before take_along_axis was available (it still works fine) was:

In [226]: ranks[np.arange(4)[:,None], ind]                                                           
Out[226]: 
array([[4, 5, 6, 8, 9],
       [1, 3, 6, 7, 8],
       [1, 2, 3, 4, 8],
       [0, 0, 0, 5, 6]])

Obviously we could apply the same indexing to unsorted, though I don't know what you mean by the top two rows. What are the top two rows of Out[226]?
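A minimal sketch of applying the same index to unsorted, assuming the goal is to reorder each row of unsorted by the sort order of the matching row of ranks. The ranks values are copied from the session above. The unsorted values, the kind="stable" argument, and the names reordered, same, first_two_columns and first_two_rows are illustrative choices, not part of the original post.

```python
import numpy as np

# ranks is copied from the session above; unsorted is made-up data
# chosen so that each value reveals where it came from.
ranks = np.array([[5, 9, 4, 8, 6],
                  [8, 6, 7, 3, 1],
                  [1, 2, 3, 4, 8],
                  [6, 0, 0, 5, 0]])
unsorted = np.arange(20).reshape(4, 5)

# kind="stable" keeps tied ranks (the zeros in the last row) in their
# original left-to-right order, so the result is reproducible.
ind = np.argsort(ranks, axis=1, kind="stable")

# Reorder each row of unsorted by the sort order of the same row of ranks.
reordered = np.take_along_axis(unsorted, ind, axis=1)

# Equivalent advanced indexing: a (4, 1) column of row numbers
# broadcasts against the (4, 5) array of column positions.
same = unsorted[np.arange(unsorted.shape[0])[:, None], ind]

# Two possible readings of "top 2" (which one is wanted is unclear):
first_two_columns = reordered[:, :2]  # two lowest-ranked entries per row
first_two_rows = reordered[:2, :]     # simply the first two rows
```

Both indexing forms leave the shape at (4, 5); only the slice taken afterwards depends on what "top 2" is meant to select.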

Sorting on 2d arrays is tricky; it's hard to visualize what's happening. I changed your example to use integers and small shape to better visualize the action.
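If "sort by column" in the question means sorting down each column (axis=0) rather than along each row, then "the top 2 rows" has a natural reading: the two lowest-ranked entries of every column. The sketch below shows that variant under this assumption. The unsorted values and the names ind0, reordered and top2 are hypothetical.

```python
import numpy as np

# Same small ranks array as in the session above; unsorted is made-up data.
ranks = np.array([[5, 9, 4, 8, 6],
                  [8, 6, 7, 3, 1],
                  [1, 2, 3, 4, 8],
                  [6, 0, 0, 5, 0]])
unsorted = np.arange(20).reshape(4, 5)

# Sort down each column: ind0[i, j] is the row of ranks that holds
# the i-th smallest value of column j.
ind0 = np.argsort(ranks, axis=0, kind="stable")

# Rearrange every column of unsorted by the order of the same column of ranks.
reordered = np.take_along_axis(unsorted, ind0, axis=0)

# After a column-wise sort, the first two rows hold, for each column,
# the unsorted values that belong to the two smallest ranks.
top2 = reordered[:2, :]

# Equivalent advanced indexing: row positions paired with column numbers.
same = unsorted[ind0, np.arange(unsorted.shape[1])]
```

The only changes from the row-wise version are the axis argument and which index array carries the broadcasted arange.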

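unsorted[ind] is not right. Here ind holds column positions, with values 0 through 4. Used on its own as an index, it selects along the first dimension (rows) instead. In my reduced example that raises an IndexError, because index 4 is out of bounds for an array with only 4 rows. Your example runs, because 10 rows can accommodate indices 0 to 4, but each index pulls out a whole row, so the result has shape (10,5,5) instead of (10,5).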
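The shape problem can be checked directly. This is a small sketch using the array sizes from the question. The seeded generator and the names wrong, right, small, small_ind and raised are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the run repeatable
ranks = rng.uniform(0, 1, (10, 5))
unsorted = rng.uniform(0, 1, (10, 5))
ind = np.argsort(ranks, axis=1)

# Indexing with ind alone treats each entry as a ROW number, so
# wrong[i, j] is the entire row unsorted[ind[i, j]].
wrong = unsorted[ind]
print(wrong.shape)  # (10, 5, 5)

# With only 4 rows, a column position of 4 is out of bounds on axis 0.
small = np.arange(20).reshape(4, 5)
small_ind = np.array([[2, 0, 4, 3, 1]])
try:
    small[small_ind]
    raised = False
except IndexError:
    raised = True

# Pairing each column position with its own row keeps the original shape.
right = np.take_along_axis(unsorted, ind, axis=1)
print(right.shape)  # (10, 5)
```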
