
Pandas group data frame and sort by column value

I am trying to group a data frame and sort it at the same time by the absolute value of a certain column.

    groups  values  foo  bar
75       A       3    1    2
77       B      -3   31   34
112      A       4    0    4
129      C      50    5    3
134      C     -60   44    5

On the whole data frame I can use

df.reindex(df['values'].abs().sort_values(ascending=False).index)

This works perfectly fine. However, for the grouped data frame this obviously does not work.

When I try,

df.groupby('groups')['values'].reindex(df['values'].abs().sort_values(ascending=False).index)

I get the expected error:

AttributeError: Cannot access callable attribute 'reindex' of 'SeriesGroupBy' objects, try using the 'apply' method

Using apply would probably require adding another column for the absolute values, which I would like to avoid. Is there a neat way to do this?

The desired output is a grouped data frame (a GroupBy object) in which each group is sorted by the absolute value of the values column:

for group, data in df_grouped:
    print(group, data)
A,
     values  foo  bar
75        3    1    2
112       4    0    4
B,
     values  foo  bar
77       -3   31   34
C,
     values  foo  bar
134     -60   44    5
129      50    5    3

UPDATE2:

In [433]: for g,x in grp:
   .....:     print(g, x)
   .....:
A     groups  values  foo  bar
112      A       4    0    4
75       A       3    1    2
B    groups  values  foo  bar
77      B      -3   31   34
C     groups  values  foo  bar
134      C     -60   44    5
129      C      50    5    3

UPDATE: ready for grouping:

In [428]: grp = (df.assign(abs_val=df['values'].abs())
   .....:          .sort_values(['groups','abs_val'], ascending=[True, False])
   .....:          .drop(columns='abs_val')
   .....:          .groupby('groups'))

In [429]: grp.agg({'foo': ['first','last'], 'bar': ['min','mean','max']})
Out[429]:
         foo      bar
       first last min mean max
groups
A          0    1   2    3   4
B         31   31  34   34  34
C         44    5   3    4   5
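
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. The frame is rebuilt by hand from the sample data in the question, and the names `sorted_df` and `summary` are only illustrative. It uses the keyword form `drop(columns=...)`, because the positional axis argument is no longer accepted in pandas 2.x.

```python
import pandas as pd

# Sample frame reconstructed from the question (index values included).
df = pd.DataFrame(
    {
        "groups": ["A", "B", "A", "C", "C"],
        "values": [3, -3, 4, 50, -60],
        "foo": [1, 31, 0, 5, 44],
        "bar": [2, 34, 4, 3, 5],
    },
    index=[75, 77, 112, 129, 134],
)

# Sort by group, then by descending absolute value, via a temporary column
# that is dropped again before grouping.
sorted_df = (
    df.assign(abs_val=df["values"].abs())
      .sort_values(["groups", "abs_val"], ascending=[True, False])
      .drop(columns="abs_val")
)
grp = sorted_df.groupby("groups")

# 'first' and 'last' depend on the row order inside each group,
# which is why the sort has to happen before the groupby.
summary = grp.agg({"foo": ["first", "last"], "bar": ["min", "mean", "max"]})
print(summary)
```

The temporary column only exists inside the method chain, so the original `df` is left unchanged.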

OLD answer:

In [393]: df.assign(abs_val=df['values'].abs()).sort_values(['groups','abs_val'], ascending=[True, False]).drop(columns='abs_val')
Out[393]:
    groups  values
112      A       4
75       A       3
77       B      -3
134      C     -60
129      C      50
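
If the helper column is the main objection, a possible alternative is the `key` argument of `sort_values`, which transforms the column for sorting only. This is a sketch, assuming pandas 1.1 or later (where `key` was added) and relying on `groupby` keeping the existing row order inside each group. The frame is again rebuilt from the question's sample data, and `by_abs` and `order` are illustrative names.

```python
import pandas as pd

# Sample frame reconstructed from the question (index values included).
df = pd.DataFrame(
    {
        "groups": ["A", "B", "A", "C", "C"],
        "values": [3, -3, 4, 50, -60],
        "foo": [1, 31, 0, 5, 44],
        "bar": [2, 34, 4, 3, 5],
    },
    index=[75, 77, 112, 129, 134],
)

# Sort the whole frame by descending absolute value; key= is applied to the
# column for ordering only, so no extra column is created.
by_abs = df.sort_values("values", key=lambda s: s.abs(), ascending=False)

# Group keys come out sorted (A, B, C); rows keep their order within a group.
grp = by_abs.groupby("groups")
for name, data in grp:
    print(name)
    print(data.drop(columns="groups"))

# Index order per group, handy for checking the result.
order = {name: data.index.tolist() for name, data in grp}
```

Rows with equal absolute values (here 3 and -3) land in different groups in this sample, so tie order does not matter; if ties can occur within a group, passing `kind="stable"` to `sort_values` keeps them in their original order.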
