Index and column for the max value in pandas dataframe

I have a pandas dataframe df with six columns and five rows. I would like to get the row and column names of the three largest values.

Example:

df = 

  A   B  C  D  E  F
1 00 01 02 03 04 05
2 06 07 08 09 10 11
3 12 13 14 15 16 17
4 18 19 20 21 22 23
5 24 25 26 27 28 29

The output should be something like [5,F],[5,E],[5,D].

You could use unstack before sorting:

>>> df
    A   B   C   D   E   F
1   0   1   2   3   4   5
2   6   7   8   9  10  11
3  12  13  14  15  16  17
4  18  19  20  21  22  23
5  24  25  26  27  28  29
>>> df.unstack()
A  1     0
   2     6
   3    12
   4    18
   5    24
B  1     1
   2     7
   3    13
   4    19
   5    25
[...]
F  1     5
   2    11
   3    17
   4    23
   5    29

and so:

>>> df2 = df.unstack().sort_values()
>>> df2.iloc[-3:]
D  5    27
E  5    28
F  5    29
dtype: int64
>>> df2.iloc[-3:].index
MultiIndex([('D', 5),
            ('E', 5),
            ('F', 5)],
           )

or even:

>>> s = df.unstack()
>>> s.index[s.to_numpy().argsort()[-3:]]
MultiIndex([('D', 5),
            ('E', 5),
            ('F', 5)],
           )

[I didn't bother reversing the order: adding [::-1] at the end should do it.]

I am not going to pretend these are the most efficient ways of dealing with this problem, but I thought they were worth mentioning:
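For reference, the unstack-and-sort steps above can be put together as a self-contained sketch. It assumes a pandas version that provides Series.sort_values, and the helper name top_n_locations is hypothetical, used here only for illustration. It returns (row, column) pairs with the largest value first, which is the order the question asks for.

```python
import numpy as np
import pandas as pd


def top_n_locations(frame, n=3):
    """Return [(row, column), ...] for the n largest values, largest first.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # unstack() turns the frame into a Series indexed by (column, row).
    flat = frame.unstack()
    # Sort in descending order and keep the first n entries.
    top = flat.sort_values(ascending=False).iloc[:n]
    # Swap each (column, row) key into (row, column) order.
    return [(row, col) for col, row in top.index]


# Rebuild the example frame from the question.
df = pd.DataFrame(
    np.arange(30).reshape(5, 6),
    index=range(1, 6),
    columns=list("ABCDEF"),
)

result = top_n_locations(df)
print(result)
```

Sorting in descending order up front avoids the separate [::-1] reversal step.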

df

    A   B   C   D   E   F
1   0   1   2   3   4   5
2   6   7   8   9  10  11
3  12  13  14  15  16  17
4  18  19  20  21  22  23
5  24  25  26  27  28  29

Use df.max() to get the maximum value of each column, then sort those values and keep the three largest. Then mask them against the original df and return the values. Finally, a list comprehension is used to get the indices:

df_2 = df[df.eq(df.max().sort_values().tail(3))]
[(i, df_2[i].first_valid_index()) for i in df_2.columns if df_2[i].first_valid_index() is not None]

Output:

[('D', 5), ('E', 5), ('F', 5)]

or

s = df_2.apply(pd.Series.first_valid_index).dropna()
list(zip(s.index, s.astype(int)))

Output:

[('D', 5), ('E', 5), ('F', 5)]
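The masking approach can also be written out step by step as a runnable sketch. The variable names (top_maxima, mask, masked, pairs) are illustrative, and the frame is rebuilt from the example in the question. The comparison is written as df.eq(series) so that pandas aligns the Series on the column labels.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    np.arange(30).reshape(5, 6),
    index=range(1, 6),
    columns=list("ABCDEF"),
)

# Three largest per-column maxima, as a Series indexed by column name.
top_maxima = df.max().sort_values().tail(3)

# DataFrame.eq aligns the Series on the columns; columns that are
# missing from top_maxima compare as False everywhere.
mask = df.eq(top_maxima)

# Keep only the matching cells; every other cell becomes NaN.
masked = df[mask]

# first_valid_index() gives the first row label holding a non-NaN cell,
# or None when the whole column is NaN.
pairs = [
    (col, masked[col].first_valid_index())
    for col in masked.columns
    if masked[col].first_valid_index() is not None
]
print(pairs)
```

Because it starts from the per-column maxima, this approach finds at most one cell per column. It matches the overall top three only when those values lie in different columns, as they do in this example.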
