Sort 2d numpy string/float array by column
I have the following numpy array (the number of rocket launches per country since 1957), and I would like to sort it in ascending order by number of launches.
['Australia', 6.0],
['Brazil', 3.0],
['China', 269.0],
['France', 303.0],
['India', 76.0],
['Iran', 14.0],
['Israel', 11.0],
['Japan', 126.0],
['Kazakhstan', 701.0],
['Kenya', 9.0],
['New Zealand', 13.0],
['North Korea', 5.0],
['Pacific Ocean', 36.0],
['Russian Federation', 1398.0],
['South Korea', 3.0],
['USA', 1351.0]
The problem is that np.sort(a, axis=0) sorts each column independently, so the countries are no longer linked to their values; e.g. North Korea ends up with 269 launches (which is probably more likely than 5).
Or, if I do np.sort(a, axis=1), I get an error saying
TypeError: '<' not supported between instances of 'float' and 'str'
Any ideas would be very much appreciated!
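The behaviour described in the question can be reproduced on a small scale. This is only a sketch: it assumes the array was built with dtype=object (the question does not say how `a` was created), and it uses three rows instead of the full table.

```python
import numpy as np

# Three rows from the question, stored as an object array so that strings and
# floats can coexist (an assumption about how the original array was built).
arr = np.array([['China', 269.0],
                ['Brazil', 3.0],
                ['North Korea', 5.0]], dtype=object)

# axis=0 sorts every column on its own: names alphabetically, counts
# numerically, so the rows no longer belong together.
col_sorted = np.sort(arr, axis=0)
print(col_sorted.tolist())

# axis=1 sorts within each row, which means comparing a str with a float.
try:
    np.sort(arr, axis=1)
    raised = False
except TypeError as exc:
    raised = True
    print('TypeError:', exc)
```

After the axis=0 sort, 'North Korea' sits next to 269.0 even though its real value is 5.0, which is exactly the mismatch described above.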
import numpy as np
data = [
['Australia', 6.0],
['Brazil', 3.0],
['China', 269.0],
['France', 303.0],
['India', 76.0],
['Iran', 14.0],
['Israel', 11.0],
['Japan', 126.0],
['Kazakhstan', 701.0],
['Kenya', 9.0],
['New Zealand', 13.0],
['North Korea', 5.0],
['Pacific Ocean', 36.0],
['Russian Federation', 1398.0],
['South Korea', 3.0],
['USA', 1351.0]
]
We can create a structured array and then sort it by keys:
dtype = [
('name', '<U18'),
('rockets', float)
]
data = np.array([tuple(x) for x in data], dtype=dtype)
sorted_data = np.sort(data, order=['rockets'])
print(sorted_data)
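As a possible follow-up, here is a minimal sketch of what can be done with the sorted structured array: breaking ties with a second key, pulling out individual fields, and getting descending order. It uses a four-row subset of the data, and the variable names (`rows`, `by_rockets`, and so on) are illustrative only.

```python
import numpy as np

# Small subset of the launch data from the question.
rows = [('Australia', 6.0), ('Brazil', 3.0), ('China', 269.0), ('South Korea', 3.0)]

dtype = [('name', '<U18'), ('rockets', float)]
arr = np.array(rows, dtype=dtype)

# Sort by launch count; rows with equal counts are then ordered by name,
# because 'name' is listed as the second key.
by_rockets = np.sort(arr, order=['rockets', 'name'])

# Each field can be pulled out as an ordinary 1-D array.
names = by_rockets['name'].tolist()
counts = by_rockets['rockets'].tolist()

# np.sort has no descending flag, so reverse the ascending result instead.
descending_names = by_rockets[::-1]['name'].tolist()

print(names)
print(counts)
print(descending_names)
```

Listing a second key in `order` makes the result deterministic when two countries have the same count, as Brazil and South Korea do here.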
This is easy with Python list sorting:
In [208]: alist = [ ['Australia', 6.0],
...: ['Brazil', 3.0],
...: ['China', 269.0],
...: ['France', 303.0],
...: ['India', 76.0],
...: ['Iran', 14.0],
...: ['Israel', 11.0],
...: ['Japan', 126.0],
...: ['Kazakhstan', 701.0],
...: ['Kenya', 9.0],
...: ['New Zealand', 13.0],
...: ['North Korea', 5.0],
...: ['Pacific Ocean', 36.0],
...: ['Russian Federation', 1398.0],
...: ['South Korea', 3.0],
...: ['USA', 1351.0]]
In [209]: newlist = sorted(alist, key=lambda x: x[1])
In [210]: newlist
Out[210]:
[['Brazil', 3.0],
['South Korea', 3.0],
['North Korea', 5.0],
['Australia', 6.0],
['Kenya', 9.0],
['Israel', 11.0],
['New Zealand', 13.0],
['Iran', 14.0],
['Pacific Ocean', 36.0],
['India', 76.0],
['Japan', 126.0],
['China', 269.0],
['France', 303.0],
['Kazakhstan', 701.0],
['USA', 1351.0],
['Russian Federation', 1398.0]]
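The same list approach extends to descending order and to explicit tie-breaking. The following is a small sketch on a four-row subset; the names `descending` and `by_count_then_name` are made up for illustration.

```python
# Four rows from the list above, deliberately not in alphabetical order.
alist = [['South Korea', 3.0], ['Australia', 6.0], ['China', 269.0], ['Brazil', 3.0]]

# Descending by launch count. sorted() is stable, so rows with equal counts
# keep their original relative order even with reverse=True.
descending = sorted(alist, key=lambda x: x[1], reverse=True)

# Ascending by count, with ties broken alphabetically by country name.
by_count_then_name = sorted(alist, key=lambda x: (x[1], x[0]))

print(descending)
print(by_count_then_name)
```

Using a tuple as the key is the usual way to sort on several columns at once with plain Python lists.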
With an object dtype array (to preserve the string and float columns):
In [211]: arr = np.array(alist, object)
In [212]: arr
Out[212]:
array([['Australia', 6.0],
['Brazil', 3.0],
['China', 269.0],
['France', 303.0],
...
['USA', 1351.0]], dtype=object)
Get a sorting index by just looking at the 2nd column:
In [213]: idx = np.argsort(arr[:,1])
In [214]: idx
Out[214]: array([ 1, 14, 11, 0, 9, 6, 10, 5, 12, 4, 7, 2, 3, 8, 15, 13])
In [215]: arr[idx]
Out[215]:
array([['Brazil', 3.0],
['South Korea', 3.0],
['North Korea', 5.0],
['Australia', 6.0],
['Kenya', 9.0],
...
['Russian Federation', 1398.0]], dtype=object)
The structured array approach in the other answer is fine too.
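A variation on the argsort idea, sketched here under the assumption that you are free to store the data differently: keep the names and the counts in two separate homogeneous arrays and apply one index array to both. This avoids the object dtype; the array names are illustrative.

```python
import numpy as np

# Names and counts kept in parallel arrays (same positions belong together).
names = np.array(['Australia', 'Brazil', 'China', 'South Korea'])
counts = np.array([6.0, 3.0, 269.0, 3.0])

# kind='stable' keeps equal counts in their original relative order.
idx = np.argsort(counts, kind='stable')

# Applying the same index to both arrays keeps each name with its count.
sorted_names = names[idx].tolist()
sorted_counts = counts[idx].tolist()

# Reversing the index gives descending order.
descending_names = names[idx[::-1]].tolist()

print(sorted_names)
print(sorted_counts)
print(descending_names)
```

Because `counts` is a plain float array here, the sort runs on native floats rather than on Python objects.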