GroupBy one column, custom operation on another column of grouped records in pandas

I want to apply a custom operation to one column after grouping on another column: group by a column to get the number of rows in each group, then divide the other column's value by that count for every record in the group.

My DataFrame:

   emp opp amount
0  a   1   10
1  b   1   10
2  c   2   30
3  b   2   30
4  d   2   30
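
To make the example reproducible, the frame above can be built roughly like this. This is a minimal sketch; the integer dtypes for opp and amount are an assumption based on the printed values.

```python
import pandas as pd

# Sample data copied from the table above (dtypes assumed to be integers).
df = pd.DataFrame({
    "emp": ["a", "b", "c", "b", "d"],
    "opp": [1, 1, 2, 2, 2],
    "amount": [10, 10, 30, 30, 30],
})
print(df)
```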

My scenario:

  • For opp=1, two employees worked (a, b), so the amount should be shared as 10/2 = 5.
  • For opp=2, three employees worked (b, c, d), so the amount should be shared as 30/3 = 10.

Final output DataFrame:

      emp opp amount
    0  a   1   5
    1  b   1   5
    2  c   2   10
    3  b   2   10
    4  d   2   10

What is the best way to do this?

df['amount'] = df.groupby('opp')['amount'].transform(lambda g: g/g.size)

df
#  emp  opp amount
# 0  a    1      5
# 1  b    1      5
# 2  c    2     10
# 3  b    2     10
# 4  d    2     10

Or:

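df['amount'] = df.groupby('opp', group_keys=False)['amount'].apply(lambda g: g/g.size)

does a similar thing. Passing group_keys=False keeps the original row index on recent pandas versions, so the result can be assigned straight back to the column.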
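
A vectorized variant avoids the Python-level lambda. This is a sketch, assuming the sample data from the question: transform('size') broadcasts each group's row count back onto the original rows, so the division becomes an ordinary column operation. The column name share is made up here so that the original amount stays intact for comparison.

```python
import pandas as pd

# Sample data from the question (dtypes assumed to be integers).
df = pd.DataFrame({
    "emp": ["a", "b", "c", "b", "d"],
    "opp": [1, 1, 2, 2, 2],
    "amount": [10, 10, 30, 30, 30],
})

# Number of rows in each opp group, aligned with the original index.
group_size = df.groupby("opp")["amount"].transform("size")

# Split each amount evenly among the rows of its group.
# "share" is a hypothetical column name used only for this illustration.
df["share"] = df["amount"] / group_size
print(df)
```

The division produces floats, so cast with astype(int) afterwards if whole numbers are required and the amounts are known to divide evenly.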

You could try something like this:

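# Number of rows per opp group, indexed by opp
df2 = df.groupby('opp').amount.count()
# .ix was removed in pandas 1.0; use label-based .loc to look up the count
df.loc[:, 'calculated'] = df.apply(
    lambda row: row.amount / df2.loc[row.opp], axis=1)
df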

Yields:

  emp  opp  amount  calculated
0   a    1      10           5
1   b    1      10           5
2   c    2      30          10
3   b    2      30          10
4   d    2      30          10
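
A row-wise apply can get slow on larger frames. One possible alternative, sketched here with the same sample data, looks up every row's group count in a single step with Series.map and then divides the two columns directly.

```python
import pandas as pd

# Sample data from the question (dtypes assumed to be integers).
df = pd.DataFrame({
    "emp": ["a", "b", "c", "b", "d"],
    "opp": [1, 1, 2, 2, 2],
    "amount": [10, 10, 30, 30, 30],
})

# Rows per opp group, indexed by the opp value.
counts = df.groupby("opp")["amount"].count()

# Map each row's opp to its group count, then divide column-wise.
df["calculated"] = df["amount"] / df["opp"].map(counts)
print(df)
```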
