
Python - argsort sorting incorrectly

What is the problem? What am I doing wrong?

I am new to Python and I could not find the problem. Thanks a lot in advance for your help.

The code is

import numpy as np
users = [["Richard", 18],["Sophia", 16],["Kelly", 3],["Anna", 15],["Nicholas", 17],["Lisa", 2]]
users = np.array(users)
print(users[users[:, 1].argsort()])

Output should be

[['Lisa' '2']
 ['Kelly' '3']
 ['Anna' '15']
 ['Sophia' '16']
 ['Nicholas' '17']
 ['Richard' '18']]

But output is

[['Anna' '15']
 ['Sophia' '16']
 ['Nicholas' '17']
 ['Richard' '18']
 ['Lisa' '2']
 ['Kelly' '3']]

The numbers are being interpreted as strings, so '15' sorts before '2', just as 'ae' sorts before 'b'. The single quotes around values like '15' in the output are a clue to this.
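A minimal sketch of what is happening, using a shortened copy of the list from the question (the variable name arr is only for illustration): the array ends up with a string dtype, and strings compare character by character.

```python
import numpy as np

users = [["Richard", 18], ["Sophia", 16], ["Kelly", 3]]
arr = np.array(users)

# Mixed str/int input is coerced to a single Unicode string dtype.
print(arr.dtype.kind)   # 'U' stands for Unicode string

# The age is no longer an int; it compares equal to the string '18'.
print(arr[0, 1] == "18")

# Strings compare character by character, so '15' < '2' because '1' < '2'.
print("15" < "2")
print(sorted(["2", "15", "3"]))
```

The last line prints the same kind of ordering seen in the question: '15' first, then '2', then '3'.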

To create a NumPy array that holds a mixture of data types (strings for the names, ints for the numbers), specify the data type as object when creating the array:

users = np.array(users, dtype=object)

That will give the output you're looking for.
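Put together with the code from the question, this could look like the sketch below. The names users_obj, order and sorted_users are only illustrative.

```python
import numpy as np

users = [["Richard", 18], ["Sophia", 16], ["Kelly", 3],
         ["Anna", 15], ["Nicholas", 17], ["Lisa", 2]]

# dtype=object stores each element as the original Python object,
# so the second column still holds ints rather than strings.
users_obj = np.array(users, dtype=object)

# argsort now compares ints, which gives a numeric ordering.
order = users_obj[:, 1].argsort()
sorted_users = users_obj[order]
print(sorted_users)
```

Note that the printed ages no longer have quotes around them, since they are ints.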

When the list is converted to a np.array, the integers are converted to strings, and strings are sorted differently from numbers.

You could sort the list first and then convert it to an array (if you really want an array).

users = [["Richard", 18],["Sophia", 16],["Kelly", 3],["Anna", 15],["Nicholas", 17],["Lisa", 2]]
users_sorted = sorted(users, key=lambda x: x[1])
print(users_sorted)

[['Lisa', 2], ['Kelly', 3], ['Anna', 15], ['Sophia', 16], ['Nicholas', 17], ['Richard', 18]]
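If an array is still wanted afterwards, the conversion step matters too. A small sketch, assuming the same list as above (as_strings and as_objects are illustrative names): a plain np.array call keeps the sorted row order but still turns the ages into strings, while dtype=object leaves them as ints.

```python
import numpy as np

users = [["Richard", 18], ["Sophia", 16], ["Kelly", 3],
         ["Anna", 15], ["Nicholas", 17], ["Lisa", 2]]

# Sort in plain Python, where the ages are still ints.
users_sorted = sorted(users, key=lambda x: x[1])

# A plain conversion keeps the row order but turns the ages into strings.
as_strings = np.array(users_sorted)

# dtype=object keeps the row order and leaves the ages as ints.
as_objects = np.array(users_sorted, dtype=object)

print(as_strings[0])
print(as_objects[0])
```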

Try this if you don't want to change the dtype:

print(users[users[:, 1].astype(float).argsort()])

It should give you the result you are looking for. The answer given by slothrop is enough as an explanation.
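A sketch of this approach as a complete script, assuming the ages are whole numbers so that astype(int) could be used in place of astype(float). The converted column is only used to compute the order; the rows themselves keep their string dtype.

```python
import numpy as np

users = [["Richard", 18], ["Sophia", 16], ["Kelly", 3],
         ["Anna", 15], ["Nicholas", 17], ["Lisa", 2]]
users = np.array(users)  # string dtype, as in the question

# Convert only the sort key to numbers; the array itself is unchanged.
order = users[:, 1].astype(int).argsort()
sorted_users = users[order]

print(sorted_users)
print(sorted_users.dtype.kind)  # still 'U': the ages remain strings
```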

If there is a possibility of multiple people having the same number, you probably want to use np.lexsort instead of np.argsort, to sort first on the number and then by name:

import numpy as np

users = [["Richard", 18], ["Sophia", 16], ["Kelly", 3], ["Bob", 15], ["Anna", 15], ["Nicholas", 17], ["Lisa", 2]]
users = np.array(users, dtype=object)
sorted_users = users[np.lexsort((users[:,0], users[:,1]))]
print(sorted_users)

Output:

[['Lisa' 2]
 ['Kelly' 3]
 ['Anna' 15]
 ['Bob' 15]
 ['Sophia' 16]
 ['Nicholas' 17]
 ['Richard' 18]]
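One detail that is easy to get backwards: np.lexsort treats the last key in the tuple as the primary sort key. A small sketch with two separate, made-up arrays (names and ages are illustrative, not from the question's array) shows the effect of swapping the keys:

```python
import numpy as np

names = np.array(["Bob", "Anna", "Lisa"])
ages = np.array([15, 15, 2])

# Last key (ages) is primary; names only break ties between equal ages.
by_age_then_name = np.lexsort((names, ages))
print(names[by_age_then_name])   # Lisa, then Anna, then Bob

# Swapping the keys makes names primary instead.
by_name_then_age = np.lexsort((ages, names))
print(names[by_name_then_age])   # Anna, then Bob, then Lisa
```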

The equivalent without using numpy would be like this:

users = [["Richard", 18], ["Sophia", 16], ["Kelly", 3], ["Bob", 15], ["Anna", 15], ["Nicholas", 17], ["Lisa", 2]]
sorted_users = sorted(users, key=lambda user: (user[1], user[0]))
print(sorted_users)

Output:

[['Lisa', 2], ['Kelly', 3], ['Anna', 15], ['Bob', 15], ['Sophia', 16], ['Nicholas', 17], ['Richard', 18]]

