Summarizing values in a pandas DataFrame

Question

I want to calculate the maximum value for each year and show the sector and that value. For example, from the screenshot, I would like to display: 2010: Telecom 781, 2011: Tech 973.

I have tried using: df.groupby(['Year', 'Sector'])['Revenue'].max()

but this does not give me the name of the Sector that has the highest value.

Answer 1

Try using idxmax and loc:

df.loc[df.groupby(['Sector','Year'])['Revenue'].idxmax()]

MCVE:

import pandas as pd
import numpy as np

np.random.seed(123)
df = pd.DataFrame({'Sector':['Telecom','Tech','Financial Service','Construction','Heath Care']*3,
                   'Year':[2010,2011,2012,2013,2014]*3,
                   'Revenue':np.random.randint(101,999,15)})

df.loc[df.groupby(['Sector','Year'])['Revenue'].idxmax()]

Output:

               Sector  Year  Revenue
3        Construction  2013      423
12  Financial Service  2012      838
9          Heath Care  2014      224
1                Tech  2011      466
5             Telecom  2010      843
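
The question asks for one row per year. In the example above each sector appears in only one year, so grouping on both columns happens to give the same rows as grouping on Year. With data where several sectors report in the same year, grouping on Year alone is probably what is wanted. A minimal sketch, assuming made-up figures (only the two maxima, Telecom 781 and Tech 973, come from the question):

```python
import pandas as pd

# Hypothetical data: only the 2010 and 2011 maxima are taken from the question.
df = pd.DataFrame({
    'Sector': ['Telecom', 'Tech', 'Telecom', 'Tech'],
    'Year': [2010, 2010, 2011, 2011],
    'Revenue': [781, 512, 640, 973],
})

# For each year, idxmax returns the index label of the row with the largest Revenue;
# loc then pulls those full rows, so the Sector column comes along.
top_rows = df.loc[df.groupby('Year')['Revenue'].idxmax()]
print(top_rows)
```

If two sectors tie for the maximum in a year, idxmax keeps only the first one it finds.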

Answer 2

You can also use .sort_values + .tail, grouping on just Year. Data from @Scott Boston.

df.sort_values('Revenue').groupby('Year').tail(1)

Output:

               Sector  Year  Revenue
9          Heath Care  2014      224
3        Construction  2013      423
1                Tech  2011      466
12  Financial Service  2012      838
5             Telecom  2010      843
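
To get the "2010: Telecom 781" style of display mentioned in the question, the selected rows can be sorted by Year and formatted line by line. This is only a sketch: the figures are made up (apart from the two maxima quoted in the question) and the names top and lines are illustrative.

```python
import pandas as pd

# Hypothetical data: only the 2010 and 2011 maxima are taken from the question.
df = pd.DataFrame({
    'Sector': ['Telecom', 'Tech', 'Telecom', 'Tech'],
    'Year': [2010, 2010, 2011, 2011],
    'Revenue': [781, 512, 640, 973],
})

# Sort ascending by Revenue so the last row of each Year group is its maximum,
# then order the result by Year for display.
top = df.sort_values('Revenue').groupby('Year').tail(1).sort_values('Year')

# Build one "Year: Sector Revenue" line per selected row.
lines = [f"{row.Year}: {row.Sector} {row.Revenue}" for row in top.itertuples(index=False)]
print('\n'.join(lines))
```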

Answer 1
2 2018-10-10 02:51:13

Answer 2
2 2018-10-10 02:59:30