Reordering a pandas DataFrame based on multiple columns and the sum of one column
I have the following DataFrame:
Country_FAO type mean_area
0 Afghanistan car 2029000.0
1 Afghanistan car 112000.0
2 Algeria bus 827000.0
3 Algeria bus 2351.0
4 Australia car 6475695.0
5 Australia car 12141000.0
6 Australia bus 293806.0
I want to reorder this DataFrame based on the sum of mean_area for each value in the Country_FAO column. The final result should look like this:
Country_FAO type mean_area
0 Australia car 12141000.0
1 Australia car 6475695.0
2 Australia bus 293806.0
3 Afghanistan car 2029000.0
4 Afghanistan car 112000.0
5 Algeria bus 827000.0
6 Algeria bus 2351.0
Australia comes first because the sum of the mean_area values across its 3 rows is the highest.
I tried this:
df_stacked.sort(['Country_FAO', 'mean_area'], ascending=[False, False])
But this does not work: it does not sum the mean_area values per country before sorting.
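To make the problem concrete, here is a small sketch of what the attempt produces. It assumes df_stacked holds exactly the seven rows shown above, and it uses sort_values because DataFrame.sort is no longer available in current pandas releases. Sorting on Country_FAO only orders the names alphabetically; no per-country total is involved.

```python
import pandas as pd

# Assumption: df_stacked contains exactly the rows shown in the question.
df_stacked = pd.DataFrame({
    "Country_FAO": ["Afghanistan", "Afghanistan", "Algeria", "Algeria",
                    "Australia", "Australia", "Australia"],
    "type": ["car", "car", "bus", "bus", "car", "car", "bus"],
    "mean_area": [2029000.0, 112000.0, 827000.0, 2351.0,
                  6475695.0, 12141000.0, 293806.0],
})

# Same idea as the attempt, written with sort_values.
# Countries are ordered by name (descending), not by their summed mean_area.
attempt = df_stacked.sort_values(["Country_FAO", "mean_area"],
                                 ascending=[False, False])
print(attempt)
```

With this data Algeria ends up ahead of Afghanistan, although Afghanistan has the larger total.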
I think you need to create a new column, sort, by using groupby with transform and sum, and then call sort_values. Finally, you can drop the sort column and call reset_index:
# Per-country total of mean_area, broadcast back to every row
df['sort'] = df.groupby('Country_FAO')['mean_area'].transform('sum')
# Sort by total first, then by country name and mean_area, all descending
df1 = df.sort_values(['sort','Country_FAO', 'mean_area'], ascending=False)
print(df1)
Country_FAO type mean_area sort
5 Australia car 12141000.0 18910501.0
4 Australia car 6475695.0 18910501.0
6 Australia bus 293806.0 18910501.0
0 Afghanistan car 2029000.0 2141000.0
1 Afghanistan car 112000.0 2141000.0
2 Algeria bus 827000.0 829351.0
3 Algeria bus 2351.0 829351.0
df1 = df1.drop('sort', axis=1).reset_index(drop=True)
print(df1)
Country_FAO type mean_area
0 Australia car 12141000.0
1 Australia car 6475695.0
2 Australia bus 293806.0
3 Afghanistan car 2029000.0
4 Afghanistan car 112000.0
5 Algeria bus 827000.0
6 Algeria bus 2351.0
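For reference, the same steps can be sketched as one self-contained script. The DataFrame construction is an assumption that reproduces the sample rows from the question, and the helper column is named total here purely for illustration. It uses assign so the original df is left unmodified, which is a stylistic choice rather than a requirement.

```python
import pandas as pd

# Assumption: this reproduces the sample data from the question.
df = pd.DataFrame({
    "Country_FAO": ["Afghanistan", "Afghanistan", "Algeria", "Algeria",
                    "Australia", "Australia", "Australia"],
    "type": ["car", "car", "bus", "bus", "car", "car", "bus"],
    "mean_area": [2029000.0, 112000.0, 827000.0, 2351.0,
                  6475695.0, 12141000.0, 293806.0],
})

result = (
    # Temporary column: sum of mean_area per country, repeated on each row
    df.assign(total=df.groupby("Country_FAO")["mean_area"].transform("sum"))
      # Largest total first; ties fall back to country name, then mean_area
      .sort_values(["total", "Country_FAO", "mean_area"], ascending=False)
      # Remove the helper column and renumber the rows from 0
      .drop(columns="total")
      .reset_index(drop=True)
)
print(result)
```

Keeping Country_FAO as the second sort key keeps each country's rows together even if two countries happen to have the same total.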