简体   繁体   中英

Pandas - adding a total row to each subgroup as first row

I know this questions has been mentioned various times on StackOverflow, I do not find it trivial to accomplish this task. This and many other answers: Adding total row to pandas DataFrame groupby

Sample of my data (there are actually 25 columns for this but they are similar, only numerical):

owner   player  val1    val1    val3
A       x       5.60    3.18    0.76
A       y       12.08   15.95   -0.24
A       z       0.03    0.05    -0.41
B       x       0.02    0.01    2.06
B       z       2.36    2.37    0.00
C       x       0.16    0.15    0.05
C       y       0.72    0.75    -0.04
D       x       0.33    0.56    -0.41

My intended output is as follows, where for each owner the total is calculated and placed as the first row in the subgroup.

owner   player  val1    val1    val3
A      total    17.71   19.18   0.11
A      x        5.60    3.18    0.76
A      y        12.08   15.95   -0.24
A      z        0.03    0.05    -0.41
B      total    2.38    2.38    2.05
B      x        0.02    0.01    2.06
B      z        2.36    2.37    0.00
C      total    0.88    0.90    0.01
C      x        0.16    0.15    0.05
C      y        0.72    0.75    -0.04
D      total    0.33    0.56    -0.41
D      x        0.33    0.56    -0.41

I attempted to use something that I also found on StackOverflow which looked like what I was searching for but I couldn't make it quite right.

def lambda_t(x):
    df = x.sort_values(['owner']).drop(['owner'],axis=1)
    df.loc['total'] = df.sum()
    return df

df.groupby(['owner']).apply(lambda_t)

While in theory this might have been something interesting, the total is not placed where I want and on top of that the values on player name are concatenating, so I end up having a really packed column. This way I end up having a multiindex.

owner       player  val1    val1    val3
A   0        x      5.60    3.18    0.76
    1        y      12.08   15.95   -0.24
    2        z      0.03    0.05    -0.41
    total    xzy    17.71   19.18   0.11
.....

Apparently, dropping level of a multiindex can help but I'm missing the total this way, it disappears.

df.groupby(['owner']).apply(lambda_t).droplevel(level=1)

owner       player  val1    val1    val3
A            x      5.60    3.18    0.76
A            y      12.08   15.95   -0.24
A            z      0.03    0.05    -0.41
A            xzy    17.71   19.18   0.11

Any ideas if this is possible? I've seen that with groupby, assign and loc you can't order them properly.

IIUC, you can use groupby.sum to compute the totals, assign the total name as player,concat the two DataFrames in order and sort_values with a stable method:

out = (pd
  .concat([df.groupby('owner', as_index=False).sum().assign(player='total'),
           df])
   .sort_values(by='owner', kind='stable', ignore_index=True)
   [df.columns]
)

output:

   owner player   val1    val1  val3
0      A  total  17.71   19.18  0.11
1      A      x   5.60    3.18  0.76
2      A      y  12.08   15.95 -0.24
3      A      z   0.03    0.05 -0.41
4      B  total   2.38    2.38  2.06
5      B      x   0.02    0.01  2.06
6      B      z   2.36    2.37  0.00
7      C  total   0.88    0.90  0.01
8      C      x   0.16    0.15  0.05
9      C      y   0.72    0.75 -0.04
10     D  total   0.33    0.56 -0.41
11     D      x   0.33    0.56 -0.41

Another possible solution:

(df.groupby('owner')
 .apply(lambda x:
 pd.concat(
     [pd.concat([pd.DataFrame({'owner': x.owner.unique(), 'player': ['total']}),
      pd.DataFrame(x.iloc[:, 2:].apply(sum, axis=0)).T], axis=1),
      x]
 ))).reset_index(drop=True)

Output:

   owner player   val1   val2  val3
0      A  total  17.71  19.18  0.11
1      A      x   5.60   3.18  0.76
2      A      y  12.08  15.95 -0.24
3      A      z   0.03   0.05 -0.41
4      B  total   2.38   2.38  2.06
5      B      x   0.02   0.01  2.06
6      B      z   2.36   2.37  0.00
7      C  total   0.88   0.90  0.01
8      C      x   0.16   0.15  0.05
9      C      y   0.72   0.75 -0.04
10     D  total   0.33   0.56 -0.41
11     D      x   0.33   0.56 -0.41

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM