Python DataFrame: set column values based on the group maximum

Question

I have the following dataframe:

   id   Area Country
0  11  34.45  Norway
1  12  30.25      UK
2  13  16.70    Iran
3  11  35.45  Sweden
4  13  20.22    Iraq
5  15  35.12     USA

dfObj['BigCountry'] = ''
dfObj['SmallCountry'] = ''

Based on the area, I want to classify each country as either big or small. I am trying to group by id and then, within each group, set the big country from max(Area) and the small country from min(Area).

The output should be:

   id  BigCountry  SmallCountry   
0  11  Sweden         Norway
1  12  UK             UK           
2  13  Iraq           Iran
5  15  USA            USA

Answer 1

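One way is to use set_index, then groupby and agg with idxmax and idxmin (df here is the dataframe from the question). With Country as the index, idxmax and idxmin return the country name of the largest and smallest Area in each id group: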

df.set_index('Country').groupby('id')['Area'].agg(['idxmax','idxmin'])\
  .rename(columns = {'idxmax':'BigCountry', 'idxmin':'SmallCountry'})

Output:

   BigCountry SmallCountry
id                        
11     Sweden       Norway
12         UK           UK
13       Iraq         Iran
15        USA          USA
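
For reference, here is a minimal self-contained sketch of the same approach. It rebuilds the sample data from the question (the variable names df and result are my own choice, not from the original post) and adds reset_index so that id comes back as a regular column, which is closer to the layout the question asks for:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [11, 12, 13, 11, 13, 15],
    "Area": [34.45, 30.25, 16.70, 35.45, 20.22, 35.12],
    "Country": ["Norway", "UK", "Iran", "Sweden", "Iraq", "USA"],
})

result = (
    df.set_index("Country")            # make Country the row label
      .groupby("id")["Area"]           # one group of areas per id
      .agg(["idxmax", "idxmin"])       # label (Country) of the max / min Area
      .rename(columns={"idxmax": "BigCountry", "idxmin": "SmallCountry"})
      .reset_index()                   # turn id back into a column
)

print(result)
```

A group with a single row, such as id 12 or 15, gets the same country in both columns, which matches the expected output in the question.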
