
Sort numpy 2d array by multiple columns

I have a 2D NumPy array that looks like this:

array([[5, 0],
       [3, 1],
       [7, 0],
       [2, 1]])

I'd like to sort by each column in turn (say right to left, so the rightmost column is the primary key and the columns to its left break ties) to get this:

array([[5, 0],
       [7, 0],
       [2, 1],
       [3, 1]])

How can I do that in numpy?

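NumPy includes a built-in function for sorting by multiple keys, np.lexsort. Note that the last key you pass is the primary sort key, so the columns are listed in reverse order of priority: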

# Last key (column 1) is the primary key; column 0 breaks ties
idx = np.lexsort((arr[:, 0], arr[:, 1]))
arr_sorted = arr[idx]
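Because the key order in lexsort is easy to get backwards, here is a small self-contained sketch using the array from the question. The variable names are only illustrative; it compares both key orders so the effect of "last key is primary" is visible:

```python
import numpy as np

arr = np.array([[5, 0],
                [3, 1],
                [7, 0],
                [2, 1]])

# Last key is primary: sort by column 1, break ties with column 0.
by_col1_then_col0 = arr[np.lexsort((arr[:, 0], arr[:, 1]))]

# Swapping the keys makes column 0 the primary key instead.
by_col0_then_col1 = arr[np.lexsort((arr[:, 1], arr[:, 0]))]

print(by_col1_then_col0.tolist())  # [[5, 0], [7, 0], [2, 1], [3, 1]]
print(by_col0_then_col1.tolist())  # [[2, 1], [3, 1], [5, 0], [7, 0]]
```

The first result matches the output asked for in the question; the second shows what you get if the keys are passed in the other order.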

Alternatively, you can use pandas if you are more familiar with its syntax; this adds some memory and time overhead, but it should be small for fewer than about 1 million rows:

import pandas as pd

arr = [
    [5,  0],
    [3,  1],
    [7,  0],
    [2,  1]
]
# Sort by column 1 first, then by column 0
df = pd.DataFrame(data=arr).sort_values([1, 0])
arr_sorted = df.to_numpy()

Output (both):

array([[5, 0],
       [7, 0],
       [2, 1],
       [3, 1]])

You can use np.lexsort to sort an array on multiple columns:

idx = np.lexsort((a[:,0], a[:,1]))

a[idx]

Output:

array([[5, 0],
       [7, 0],
       [2, 1],
       [3, 1]])
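If you need this for more than two columns, one possible way to wrap the same idea is sketched below. The helper name sort_rows_by_columns and its order parameter are hypothetical (they are not part of NumPy); the helper just builds the lexsort keys in reverse priority order, since lexsort treats the last key as the primary one:

```python
import numpy as np

def sort_rows_by_columns(a, order):
    """Sort the rows of 2D array `a` by the column indices in `order`,
    where order[0] has the highest priority (hypothetical helper)."""
    # lexsort uses the LAST key as the primary key, so reverse the priority list.
    keys = tuple(a[:, c] for c in reversed(order))
    return a[np.lexsort(keys)]

a = np.array([[5, 0],
              [3, 1],
              [7, 0],
              [2, 1]])

# Column 1 first, then column 0 (the ordering asked for in the question).
print(sort_rows_by_columns(a, [1, 0]).tolist())  # [[5, 0], [7, 0], [2, 1], [3, 1]]
```

Passing the priorities explicitly keeps the reversed-key convention in one place rather than at every call site.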

