python pandas: 3 smallest and 3 largest values

Question

How can I find the index of the 3 smallest and 3 largest values in a column in my pandas dataframe? I saw ways to find the max and min, but none to get the 3.

Answer 1

What have you tried? You could sort with s = s.sort_values() (older pandas versions used the in-place s.sort()) and then call s.head(3).index and s.tail(3).index.
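A minimal sketch of the sort-then-slice approach, assuming a recent pandas where sort_values() is available. The data and the letter labels are made up for illustration.

```python
import pandas as pd

# Hypothetical example data; the labels are arbitrary.
s = pd.Series([7, 2, 9, 4, 1, 8], index=list("abcdef"))

# sort_values() returns a new Series sorted in ascending order.
ordered = s.sort_values()

# Labels of the 3 smallest values (smallest first).
smallest_idx = ordered.head(3).index.tolist()
# Labels of the 3 largest values (still in ascending order).
largest_idx = ordered.tail(3).index.tolist()

print(smallest_idx)
print(largest_idx)
```

Because the whole Series is sorted, this costs O(n log n), which is fine for small inputs.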

Answer 2

You want to take a look at argsort (in numpy and in pandas):

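import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1,100,100).reshape(10,10))
# positions of the three smallest values in column 0
# (equal to the index labels here because df has the default 0..9 index)
df[0].argsort().values[:3]
# positions of the three largest values in column 0
df[0].argsort().values[-3:]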
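Note that argsort yields integer positions, not index labels. When the index is not the default 0..n-1, the positions have to be mapped back to labels. A small sketch of that mapping, with made-up values and a hypothetical index starting at 10:

```python
import pandas as pd

# Hypothetical column whose index labels differ from the positions.
col = pd.Series([38, 13, 73, 10, 76], index=[10, 11, 12, 13, 14])

# Positions that would sort the underlying values in ascending order.
order = col.to_numpy().argsort()

# Translate positions into index labels.
bottom_labels = col.index[order[:3]].tolist()   # labels of the 3 smallest
top_labels = col.index[order[-3:]].tolist()     # labels of the 3 largest

print(bottom_labels)
print(top_labels)
```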

Answer 3

With smaller Series, you're better off just sorting and then taking the head/tail!

This is a pandas feature request that should land in 0.14 (some fiddly bits with different dtypes need to be overcome first). Until then, an efficient solution for larger Series (> 1000 elements) is kth_smallest from pandas algos. Warning: this function mutates the array it is applied to, so use a copy!

In [11]: s = pd.Series(np.random.randn(10))

In [12]: s
Out[12]: 
0    0.785650
1    0.969103
2   -0.618300
3   -0.770337
4    1.532137
5    1.367863
6   -0.852839
7    0.967317
8   -0.603416
9   -0.889278
dtype: float64

In [13]: n = 3

In [14]: pd.algos.kth_smallest(s.values.astype(float), n - 1)
Out[14]: -0.7703374582084163

In [15]: s[s <= pd.algos.kth_smallest(s.values.astype(float), n - 1)]
Out[15]: 
3   -0.770337
6   -0.852839
9   -0.889278
dtype: float64

If you want this in order:

In [16]: s[s <= pd.algos.kth_smallest(s.values.astype(float), n - 1)].order()
Out[16]: 
9   -0.889278
6   -0.852839
3   -0.770337
dtype: float64

If you're worried about duplicates (joint nth place) you can take the head:

In [17]: s[s <= pd.algos.kth_smallest(s.values.astype(float), n - 1)].order().head(n)
Out[17]: 
9   -0.889278
6   -0.852839
3   -0.770337
dtype: float64
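As far as I know, pd.algos and Series.order() are not available in recent pandas releases. A similar selection can be sketched with NumPy's np.partition, which is swapped in here for kth_smallest and is not the same function. The values are rounded, made-up stand-ins for the session above.

```python
import numpy as np
import pandas as pd

# Made-up values, loosely mirroring the session above.
s = pd.Series([0.78, 0.96, -0.61, -0.77, 1.53, 1.36, -0.85, 0.97, -0.60, -0.89])
n = 3

# np.partition returns a partitioned copy, so s is left untouched.
# After partitioning, position n - 1 holds the n-th smallest value.
kth = np.partition(s.to_numpy(), n - 1)[n - 1]

# Keep everything up to the n-th smallest value, sort it, and cap at n rows
# in case several values tie for n-th place.
smallest = s[s <= kth].sort_values().head(n)

print(kth)
print(smallest)
```

Like kth_smallest, the partition step avoids a full sort, so only the small filtered result gets sorted.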

Answer 4

In [55]: import numpy as np               

In [56]: import pandas as pd              

In [57]: s = pd.Series(np.random.randn(5))

In [58]: s
Out[58]: 
0    0.152037
1    0.194204
2    0.296090
3    1.071013
4   -0.324589
dtype: float64

In [59]: s.nsmallest(3)  # use s.drop_duplicates().nsmallest(3) if duplicates exist
Out[59]: 
4   -0.324589
0    0.152037
1    0.194204
dtype: float64

In [60]: s.nlargest(3)  # use s.drop_duplicates().nlargest(3) if duplicates exist
Out[60]: 
3    1.071013
2    0.296090
1    0.194204
dtype: float64
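Since the question asks for the index rather than the values, the result of nsmallest/nlargest can be followed by .index. The sketch below uses made-up data with ties and also shows the keep="all" option, which assumes a reasonably recent pandas.

```python
import pandas as pd

# Hypothetical data containing tied values.
s = pd.Series([3, 1, 2, 1, 5, 5], index=list("abcdef"))

# Index labels of the 3 smallest and 3 largest values.
# With the default keep="first", ties are taken in order of appearance.
smallest_idx = s.nsmallest(3).index.tolist()
largest_idx = s.nlargest(3).index.tolist()

# keep="all" returns every row tied with the last selected value,
# so the result may hold more than n rows.
tied = s.nlargest(1, keep="all")

print(smallest_idx)
print(largest_idx)
print(tied)
```

Unlike drop_duplicates(), which discards repeated values before selecting, keep="all" retains every tied row.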

Answer 5

import pandas as pd
import numpy as np
np.random.seed(1)
x=np.random.randint(1,100,10)
y=np.random.randint(1000,10000,10)

x
array([38, 13, 73, 10, 76,  6, 80, 65, 17,  2])
y
array([8751, 4462, 6396, 6374, 3962, 3516, 9444, 4562, 5764, 9093])

data=pd.DataFrame({"age":x,
               "salary":y})


# take the 5 rows with the largest age, then order those 5 rows by ascending salary
data.nlargest(5,"age").nsmallest(5,"salary")

5 answers

Answer 1
5 2013-12-06 03:57:22

Answer 2
1 2013-12-06 04:05:12

Answer 3
1 2013-12-06 08:02:57

Answer 4
0 2017-06-05 21:57:34

Answer 5
0 2018-08-23 16:16:35
