Perform math operations in a Pandas DataFrame if a condition is met

Question

I am new to pandas.

My DataFrame looks like this:

    a1  b1   c1  d1  e1 
A   10  10   1   2   0   
B   20  20   2   1   1
C   30  30   3   1   0
D   40  40   4   1   1
E   40  40   4   1   2
F   40  40   4   1   1

I want to do math operations only across rows where e1 has the same value.

For example: (a1A + a1C) / (c1A + c1C) for rows where e1 is the same. So I would end up with a DataFrame like this:

    a1  b1   c1  d1  e1     result
A   10  10   1   2   0      (a1A + a1C) / (c1A + c1C)
B   20  20   2   1   1      (a1B + a1D + a1F) / (c1B + c1D + c1F)
C   30  30   3   1   0      Do not calculate it because it's already calculated
D   40  40   4   1   1      Do not calculate it because it's already calculated
E   40  40   4   1   2      (a1E / c1E)
F   40  40   4   1   1      Do not calculate it because it's already calculated

I do not know how to apply a condition to the calculation, or how to skip the calculation for rows whose group has already been calculated.

Thank you for your suggestions.

Answer 1

First aggregate the sum per group, then remove duplicates with Series.drop_duplicates, and finally use Series.map with the ratio of the two summed columns:

# select several columns with a list; the tuple form ['a1','c1'] fails in pandas >= 1.0
s = df.groupby('e1')[['a1', 'c1']].sum()

df['new'] = df['e1'].drop_duplicates().map(s.a1 / s.c1)
print (df)
   a1  b1  c1  d1  e1   new
A  10  10   1   2   0  10.0
B  20  20   2   1   1  10.0
C  30  30   3   1   0   NaN
D  40  40   4   1   1   NaN
E  40  40   4   1   2  10.0
F  40  40   4   1   1   NaN
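
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the table in the question, and the intermediate name `ratio` is only for illustration. Note that with this sample data every group happens to give 10.0, so the output alone does not distinguish the groups.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame(
    {
        "a1": [10, 20, 30, 40, 40, 40],
        "b1": [10, 20, 30, 40, 40, 40],
        "c1": [1, 2, 3, 4, 4, 4],
        "d1": [2, 1, 1, 1, 1, 1],
        "e1": [0, 1, 0, 1, 2, 1],
    },
    index=list("ABCDEF"),
)

# Per-group sums of a1 and c1, indexed by the e1 value.
s = df.groupby("e1")[["a1", "c1"]].sum()

# One ratio per group: a Series indexed by e1 (name chosen for illustration).
ratio = s["a1"] / s["c1"]

# Keep only the first row of each e1 group, then look up that group's ratio.
# Rows removed by drop_duplicates get NaN when the result is aligned back on the index.
df["new"] = df["e1"].drop_duplicates().map(ratio)
print(df)
```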

Also, I think mapping by unique values is usually not necessary in pandas. More commonly GroupBy.transform is used, which returns the group sums aligned to every original row, and the new column is then filled for all rows:

# transform('sum') broadcasts each group's sum back to all rows of that group
df2 = df.groupby('e1')[['a1', 'c1']].transform('sum')
print (df2)
    a1  c1
A   40   4
B  100  10
C   40   4
D  100  10
E   40   4
F  100  10

df['new'] = df2.a1 / df2.c1
print (df)
   a1  b1  c1  d1  e1   new
A  10  10   1   2   0  10.0
B  20  20   2   1   1  10.0
C  30  30   3   1   0  10.0
D  40  40   4   1   1  10.0
E  40  40   4   1   2  10.0
F  40  40   4   1   1  10.0
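
If the layout from the question is still wanted (a value only on the first row of each e1 group), the transform result can be masked afterwards. The following is a sketch under that assumption, not part of the original answer; the column name `result` and the helper name `first_in_group` are chosen for illustration.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame(
    {
        "a1": [10, 20, 30, 40, 40, 40],
        "b1": [10, 20, 30, 40, 40, 40],
        "c1": [1, 2, 3, 4, 4, 4],
        "d1": [2, 1, 1, 1, 1, 1],
        "e1": [0, 1, 0, 1, 2, 1],
    },
    index=list("ABCDEF"),
)

# Group sums broadcast back to every row of the original frame.
sums = df.groupby("e1")[["a1", "c1"]].transform("sum")
ratio = sums["a1"] / sums["c1"]

# True for the first row of each e1 group, False for later repeats.
first_in_group = ~df["e1"].duplicated()

# Keep the ratio on first occurrences only; other rows become NaN.
df["result"] = ratio.where(first_in_group)
print(df)
```

Dropping the `where` step gives the fully filled column shown above, so the mask is only needed when repeated rows should stay empty.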

Answer 1: 3, accepted, 2019-12-12 14:10:28
