
Python Dataframe setting values for a column based on groups maximum value

I have the following dataframe:

   id   Area Country
0  11  34.45  Norway
1  12  30.25      UK
2  13  16.70    Iran
3  11  35.45  Sweden
4  13  20.22    Iraq
5  15  35.12     USA

dfObj['BigCountry'] = ''
dfObj['SmallCountry'] = ''

Based on the area, I want to classify each country as either big or small. I am trying to group by id and then, based on max(Area) within each group, set the big and small country.

The output should be:

   id BigCountry SmallCountry
0  11     Sweden       Norway
1  12         UK           UK
2  13       Iraq         Iran
5  15        USA          USA

One way is to use set_index, then groupby and agg with idxmax and idxmin:

df.set_index('Country').groupby('id')['Area'].agg(['idxmax','idxmin'])\
  .rename(columns = {'idxmax':'BigCountry', 'idxmin':'SmallCountry'})

Output:

   BigCountry SmallCountry
id                        
11     Sweden       Norway
12         UK           UK
13       Iraq         Iran
15        USA          USA
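
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (named df here, as in the snippet above, rather than dfObj) and relies on the fact that idxmax and idxmin return index labels, which is why Country is moved into the index first. The final two lines, which copy the result back onto the original rows with map, are an assumption about what the question's empty BigCountry/SmallCountry columns were meant for, not something the answer itself does.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [11, 12, 13, 11, 13, 15],
    "Area": [34.45, 30.25, 16.70, 35.45, 20.22, 35.12],
    "Country": ["Norway", "UK", "Iran", "Sweden", "Iraq", "USA"],
})

# With Country as the index, idxmax/idxmin return the country name
# of the largest/smallest Area within each id group.
result = (
    df.set_index("Country")
      .groupby("id")["Area"]
      .agg(["idxmax", "idxmin"])
      .rename(columns={"idxmax": "BigCountry", "idxmin": "SmallCountry"})
)
print(result)

# Optional (assumed intent): write the per-group labels back onto every row.
df["BigCountry"] = df["id"].map(result["BigCountry"])
df["SmallCountry"] = df["id"].map(result["SmallCountry"])
print(df)
```

If one row per id is all that is needed, result is already the answer; the map step only matters when the labels should sit alongside the original rows.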
