简体   繁体   中英

GroupBy one column, custom operation on another column of grouped records in pandas

I wanted to apply a custom operation on a column by grouping the values on another column. Group by column to get the count, then divide the another column value with this count for all the grouped records.

My Data Frame:

   emp opp amount
0  a   1   10
1  b   1   10
2  c   2   30
3  b   2   30
4  d   2   30

My scenario:

  • For opp=1, two emp's worked(a,b). So the amount should be shared like 10/2 =5
  • For opp=2, two emp's worked(b,c,d). So the amount should be like 30/3 = 10

Final Output DataFrame:

      emp opp amount
    0  a   1   5
    1  b   1   5
    2  c   2   10
    3  b   2   10
    4  d   2   10

What is the best possible to do so

df['amount'] = df.groupby('opp')['amount'].transform(lambda g: g/g.size)

df
#  emp  opp amount
# 0  a    1      5
# 1  b    1      5
# 2  c    2     10
# 3  b    2     10
# 4  d    2     10

Or:

df['amount'] = df.groupby('opp')['amount'].apply(lambda g: g/g.size)

does similar thing.

You could try something like this:

df2 = df.groupby('opp').amount.count()
df.loc[:, 'calculated'] = df.apply( lambda row: \
                                  row.amount / df2.ix[row.opp], axis=1)
df

Yields:

  emp  opp  amount  calculated
0   a    1      10           5
1   b    1      10           5
2   c    2      30          10
3   b    2      30          10
4   d    2      30          10

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM