Pandas change existing column values based on another column
I have a pandas DataFrame df as below:
A B
0 70.0 20
1 NaN 20
2 28.0 100
3 75.0 120
4 56.0 30
5 84.0 90
6 NaN 100
7 19.0 10
8 93.0 80
9 94.0 70
10 72.0 20
I am trying to replace each value of A with the average of A over all rows that share the same B. For instance, for B = 20, I would like all A values to be the average of 70 and 72, ignoring NaN. What is the best way to do this, please? I am thinking along the lines of groupby, as in:
df['AA']=df.groupby('B')['A'].transform(lambda s: s=s.mean())
That did not help.
IIUC, I created column AA just to keep a reference of what 'A' was. You can always write the result back to column 'A'.
df['AA']=df[~df['A'].isnull()].groupby('B')['A'].transform('mean')
df
A B AA
0 70.0 20 71.0
1 NaN 20 NaN
2 28.0 100 28.0
3 75.0 120 75.0
4 56.0 30 56.0
5 84.0 90 84.0
6 NaN 100 NaN
7 19.0 10 19.0
8 93.0 80 93.0
9 94.0 70 94.0
10 72.0 20 71.0
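The write-back step mentioned above might look like the following. This is only a sketch: the DataFrame is rebuilt from the question's data so the snippet runs on its own, and dropping the helper column AA afterwards is an assumption about what is wanted, not something the answer prescribes.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Group means computed on the non-null rows only. Rows where A is NaN are
# absent from the result, so they stay NaN after index alignment.
df["AA"] = df[df["A"].notna()].groupby("B")["A"].transform("mean")

# Write the result back to A, then drop the helper column.
df["A"] = df["AA"]
df = df.drop(columns="AA")
```

With this approach, rows 1 and 6 keep NaN in A because they were filtered out before grouping.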
mean by default ignores NaNs, so the simplest method would just be:
df['AA'] = df.groupby('B')['A'].transform('mean')
Output:
A B AA
0 70.0 20 71.0
1 NaN 20 71.0
2 28.0 100 28.0
3 75.0 120 75.0
4 56.0 30 56.0
5 84.0 90 84.0
6 NaN 100 28.0
7 19.0 10 19.0
8 93.0 80 93.0
9 94.0 70 94.0
10 72.0 20 71.0
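For completeness, here is a self-contained sketch of the same idea that overwrites A directly instead of adding AA. The variable name group_mean is only illustrative. One caveat worth assuming: a group whose A values are all NaN would still come out as NaN, since there is nothing to average.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Selecting column A before transform yields a Series aligned to df's index.
# Every row, including those where A is NaN, receives its group's mean,
# because mean skips NaN values by default.
group_mean = df.groupby("B")["A"].transform("mean")

# Overwrite A directly when no reference column is needed.
df["A"] = group_mean
print(df)
```

Unlike filtering out the null rows first, this fills rows 1 and 6 with the mean of their B group.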