Group by one column and apply a custom operation to the grouped records of another column in pandas

Question

I want to apply a custom operation to one column after grouping by another column: group by a column to get the number of records in each group, then divide the other column's value by that count for every record in the group.

My DataFrame:

   emp opp amount
0  a   1   10
1  b   1   10
2  c   2   30
3  b   2   30
4  d   2   30
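
For reproducibility, the frame above can be built roughly like this. This is a minimal sketch; it assumes `opp` and `amount` are integer columns, which the printed output suggests but does not confirm.

```python
import pandas as pd

# Sample data from the question: one row per (emp, opp) pair.
df = pd.DataFrame({
    "emp": ["a", "b", "c", "b", "d"],
    "opp": [1, 1, 2, 2, 2],
    "amount": [10, 10, 30, 30, 30],
})
print(df)
```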

My scenario:

For opp=1, two emps worked (a, b), so the amount should be shared as 10/2 = 5.
For opp=2, three emps worked (b, c, d), so the amount should be 30/3 = 10.

Final output DataFrame:

   emp opp amount
0  a   1   5
1  b   1   5
2  c   2   10
3  b   2   10
4  d   2   10

What is the best way to do this?

Answer 1

df['amount'] = df.groupby('opp')['amount'].transform(lambda g: g/g.size)

df
#  emp  opp amount
# 0  a    1      5
# 1  b    1      5
# 2  c    2     10
# 3  b    2     10
# 4  d    2     10

Or:

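df['amount'] = df.groupby('opp', group_keys=False)['amount'].apply(lambda g: g/g.size)

does the same thing. Passing group_keys=False keeps the original row index on the result, so it still aligns with df on newer pandas versions.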
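
A self-contained sketch of the same idea, assuming the sample data from the question. It swaps the lambda for the built-in `transform("size")`, which is a substitution made here and not what the answer above wrote; `group_size` and `result` are names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "emp": ["a", "b", "c", "b", "d"],
    "opp": [1, 1, 2, 2, 2],
    "amount": [10, 10, 30, 30, 30],
})

# Number of rows in each opp group, broadcast back to every original row.
group_size = df.groupby("opp")["amount"].transform("size")

# Divide each amount by the size of its group; df itself is left unchanged.
result = df.assign(amount=df["amount"] / group_size)
print(result)
```

Because the division is true division, the resulting `amount` column is floating point even when the inputs are integers.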

Answer 2

You could try something like this:

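# Number of records per opp, indexed by opp
df2 = df.groupby('opp').amount.count()
# .ix was removed in pandas 1.0; .loc does the same label lookup
df.loc[:, 'calculated'] = df.apply(lambda row: row.amount / df2.loc[row.opp],
                                   axis=1)
df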

Yields:

  emp  opp  amount  calculated
0   a    1      10           5
1   b    1      10           5
2   c    2      30          10
3   b    2      30          10
4   d    2      30          10
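
The row-wise `apply` can probably be avoided by mapping the per-group counts back onto the `opp` column. This is a sketch under the assumption that the data matches the question's sample; `counts` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "emp": ["a", "b", "c", "b", "d"],
    "opp": [1, 1, 2, 2, 2],
    "amount": [10, 10, 30, 30, 30],
})

# Series indexed by opp holding the number of records in each group.
counts = df.groupby("opp")["amount"].count()

# Look up each row's group count by its opp value, then divide column-wise.
df["calculated"] = df["amount"] / df["opp"].map(counts)
print(df)
```

This keeps the original `amount` column and works on whole columns at once, which tends to matter on larger frames.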

Answer 1: score 5, accepted, 2016-08-10 15:13:06
Answer 2: score 3, 2016-08-10 15:13:42