
Reordering a pandas dataframe based on multiple columns and the sum of one column

I have the following dataframe:

              Country_FAO            type   mean_area
0             Afghanistan             car   2029000.0
1             Afghanistan             car    112000.0
2                 Algeria             bus    827000.0
3                 Algeria             bus      2351.0
4               Australia             car   6475695.0
5               Australia             car  12141000.0
6               Australia             bus    293806.0

I would like to reorder this dataframe by the sum of mean_area for each value in the Country_FAO column, largest sum first, with the rows inside each country ordered by descending mean_area. The end result should look like this:

              Country_FAO            type   mean_area
0               Australia             car  12141000.0
1               Australia             car   6475695.0
2               Australia             bus    293806.0
3             Afghanistan             car   2029000.0
4             Afghanistan             car    112000.0
5                 Algeria             bus    827000.0
6                 Algeria             bus      2351.0

Australia comes first because the sum of mean_area values for its 3 categories is the highest.

I tried this:

df_stacked.sort(['Country_FAO', 'mean_area'], ascending=[False, False])

This does not work, though: it sorts on the individual values and does not add up the mean_area values for each country before sorting.

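I think you need to create a helper column (named sort here) that holds each country's total mean_area, using groupby with transform, and then order the rows with sort_values. Finally, you can remove the helper column with drop and renumber the rows with reset_index: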

df['sort'] = df.groupby('Country_FAO')['mean_area'].transform('sum')

df1 = df.sort_values(['sort','Country_FAO', 'mean_area'], ascending=False)
print(df1)
   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
2      Algeria  bus    827000.0    829351.0
3      Algeria  bus      2351.0    829351.0

df1 = df1.drop('sort', axis=1).reset_index(drop=True)
print(df1)
   Country_FAO type   mean_area
0    Australia  car  12141000.0
1    Australia  car   6475695.0
2    Australia  bus    293806.0
3  Afghanistan  car   2029000.0
4  Afghanistan  car    112000.0
5      Algeria  bus    827000.0
6      Algeria  bus      2351.0
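
For reference, the steps above can be put together into one self-contained script. This is only a sketch: the helper function sort_by_group_total and the temporary column name _group_total are made up for illustration and are not part of pandas. It assumes a reasonably recent pandas version (one that supports drop(columns=...)). The sample data is the dataframe from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Country_FAO": ["Afghanistan", "Afghanistan", "Algeria", "Algeria",
                    "Australia", "Australia", "Australia"],
    "type": ["car", "car", "bus", "bus", "car", "car", "bus"],
    "mean_area": [2029000.0, 112000.0, 827000.0, 2351.0,
                  6475695.0, 12141000.0, 293806.0],
})


def sort_by_group_total(frame, group_col, value_col):
    """Order rows by descending group total, then by descending value.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # transform("sum") returns one total per row, aligned with the original index.
    total = frame.groupby(group_col)[value_col].transform("sum")
    return (
        frame.assign(_group_total=total)  # temporary helper column
             .sort_values(["_group_total", group_col, value_col], ascending=False)
             .drop(columns="_group_total")
             .reset_index(drop=True)
    )


result = sort_by_group_total(df, "Country_FAO", "mean_area")
print(result)
```

Using assign instead of df['sort'] = ... leaves the original dataframe untouched, which may or may not matter for your use case. Including the group column in the sort keys keeps the rows of two countries together if their totals happen to be equal.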


 