pandas: find the max of each series in a DataFrame

Question

Consider this data:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 20, size=(5, 4)),
                  columns=list('ABCD'),
                  index=pd.date_range('2016-04-01', '2016-04-05'))

date       A   B   C   D
1/1/2016  15   5  19   2
2/1/2016  18   1  14  11
3/1/2016  10  16   8   8
4/1/2016   7  17  17  18
5/1/2016  10  15  18  18

where date is the index.

What I want to get back is a tuple of (date, <max>, <series_name>) for each column:

   2/1/2016, 18, 'A'
   4/1/2016, 17, 'B'
   1/1/2016, 19, 'C'
   4/1/2016, 18, 'D'

How can this be done in idiomatic pandas?

Answer 1

I think you can concat max and idxmax. Finally, you can reset_index, rename the index column, and reorder the columns:

print(df)
           A   B   C   D
date                    
1/1/2016  15   5  19   2
2/1/2016  18   1  14  11
3/1/2016  10  16   8   8
4/1/2016   7  17  17  18
5/1/2016  10  15  18  18

print(pd.concat([df.max(), df.idxmax()], axis=1, keys=['max', 'date']))
   max      date
A   18  2/1/2016
B   17  4/1/2016
C   19  1/1/2016
D   18  4/1/2016

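# wrap the chained calls in parentheses so they can span several lines;
# the result gets its own name so the original df stays available
result = (pd.concat([df.max(), df.idxmax()], axis=1, keys=['max', 'date'])
            .reset_index()
            .rename(columns={'index': 'name'}))
# change order of columns
result = result[['date', 'max', 'name']]
print(result)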
       date  max name
0  2/1/2016   18    A
1  4/1/2016   17    B
2  1/1/2016   19    C
3  4/1/2016   18    D

Another solution with rename_axis (new in pandas 0.18.0):

print(pd.concat([df.max().rename_axis('name'), df.idxmax()], axis=1, keys=['max', 'date']))
      max      date
name               
A      18  2/1/2016
B      17  4/1/2016
C      19  1/1/2016
D      18  4/1/2016

# wrap the chained calls in parentheses so they can span several lines
result = (pd.concat([df.max().rename_axis('name'), df.idxmax()], axis=1, keys=['max', 'date'])
            .reset_index())
# change order of columns
result = result[['date', 'max', 'name']]
print(result)
       date  max name
0  2/1/2016   18    A
1  4/1/2016   17    B
2  1/1/2016   19    C
3  4/1/2016   18    D
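
For anyone who wants the literal tuples from the question, here is a minimal Python 3 sketch of the same idxmax/max idea. It assumes the sample values from the question typed in by hand, with the dates kept as plain string labels; the names summary and result are just illustrative.

```python
import pandas as pd

# Sample values copied from the question; dates are kept as plain string labels.
df = pd.DataFrame(
    {"A": [15, 18, 10, 7, 10],
     "B": [5, 1, 16, 17, 15],
     "C": [19, 14, 8, 17, 18],
     "D": [2, 11, 8, 18, 18]},
    index=pd.Index(["1/1/2016", "2/1/2016", "3/1/2016", "4/1/2016", "5/1/2016"],
                   name="date"),
)

# idxmax gives the index label of each column's maximum, max gives the value.
summary = pd.DataFrame({"date": df.idxmax(), "max": df.max()})
summary.index.name = "name"
summary = summary.reset_index()[["date", "max", "name"]]

# Plain Python tuples of (date, max, series_name), one per column.
result = list(summary.itertuples(index=False, name=None))
print(result)
```

Note that idxmax reports the first occurrence when a maximum is tied, which is why column D comes back as 4/1/2016 rather than 5/1/2016.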

Answer 2

You could use idxmax and max with axis=0 and then join them:

np.random.seed(632)

df = pd.DataFrame(np.random.randint(0,20,size=(5, 4)), columns=list('ABCD'))

In [28]: df
Out[28]:
    A   B   C   D
0  10  14  16   1
1  12  13   8   8
2   8  16  11   1
3   8   1  17  12
4   4   2   1   7

In [29]: df.idxmax(axis=0)
Out[29]:
A    1
B    2
C    3
D    3
dtype: int64

In [30]: df.max(axis=0)
Out[30]:
A    12
B    16
C    17
D    12
dtype: int32


In [32]: pd.concat([df.idxmax(axis=0) , df.max(axis=0)], axis=1)
Out[32]:
   0   1
A  1  12
B  2  16
C  3  17
D  3  12
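
The joined frame above ends up with columns named 0 and 1. A small sketch of one way to label them, assuming the values printed in this answer typed in by hand rather than regenerated from the seed; the labels idx and max are arbitrary choices.

```python
import pandas as pd

# Values copied from the frame printed in this answer.
df = pd.DataFrame({"A": [10, 12, 8, 8, 4],
                   "B": [14, 13, 16, 1, 2],
                   "C": [16, 8, 11, 17, 1],
                   "D": [1, 8, 1, 12, 7]})

# keys= names the two concatenated Series, replacing the default 0 and 1.
labelled = pd.concat([df.idxmax(axis=0), df.max(axis=0)],
                     axis=1, keys=["idx", "max"])
print(labelled)
```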

Answer 3

Setup

import numpy as np
import pandas as pd

np.random.seed(314)
df = pd.DataFrame(np.random.randint(0,20,size=(5, 4)),
                  columns=list('ABCD'),
                  index=pd.date_range('2016-04-01', '2016-04-05'))

print(df)

             A   B   C   D
2016-04-01   8  13   9  19
2016-04-02  10  14  16   7
2016-04-03   2   7  16   3
2016-04-04  12   7   4   0
2016-04-05   4  13   8  16

Solution

stacked = df.stack()
stacked = stacked[stacked.groupby(level=1).idxmax()]

produces

print(stacked)

2016-04-04  A    12
2016-04-02  B    14
            C    16
2016-04-01  D    19
dtype: int32
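
To turn the stacked result into the (date, max, name) tuples the question asks for, something along these lines should work. This is a sketch under a few assumptions: the frame values are typed in from the output above, the selection uses .loc with an explicit list of labels, and the names labels, winners and result are illustrative.

```python
import pandas as pd

# Values copied from the frame printed in the setup above.
df = pd.DataFrame(
    {"A": [8, 10, 2, 12, 4],
     "B": [13, 14, 7, 7, 13],
     "C": [9, 16, 16, 4, 8],
     "D": [19, 7, 3, 0, 16]},
    index=pd.date_range("2016-04-01", "2016-04-05"),
)

stacked = df.stack()
# For each column (index level 1), idxmax returns the (date, column) label
# of that column's maximum.
labels = stacked.groupby(level=1).idxmax()
winners = stacked.loc[labels.tolist()]

# Unpack the MultiIndex labels into (date, max, series_name) tuples.
result = [(date.strftime("%Y-%m-%d"), int(value), name)
          for (date, name), value in winners.items()]
print(result)
```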
