
Pandas change existing column values based on another column

I have a pandas DataFrame df as below:

       A    B
0   70.0   20
1    NaN   20
2   28.0  100
3   75.0  120
4   56.0   30
5   84.0   90
6    NaN  100
7   19.0   10
8   93.0   80
9   94.0   70
10  72.0   20

I am trying to replace each value of A with the mean of A over all rows that share the same B value. For instance, for B = 20, I would like all A values to be the average of 70 and 72, ignoring NaN. What is the best way to do this? I am thinking along the lines of groupby, as in:

df['AA']=df.groupby('B')['A'].transform(lambda s: s=s.mean())

That did not help.

IIUC, you want the per-group mean of A with the NaN rows left out.

I created column AA just to keep 'A' as a reference of what it was. You can always write the result back to column 'A'.

df['AA']=df[~df['A'].isnull()].groupby('B')['A'].transform('mean')
df
       A      B       AA
0   70.0     20     71.0
1    NaN     20      NaN
2   28.0    100     28.0
3   75.0    120     75.0
4   56.0     30     56.0
5   84.0     90     84.0
6    NaN    100      NaN
7   19.0     10     19.0
8   93.0     80     93.0
9   94.0     70     94.0
10  72.0     20     71.0
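
The NaN entries in AA above come from index alignment: the filter drops rows 1 and 6 before grouping, so the assignment has nothing to place at those labels. The following is a minimal sketch of that behaviour, assuming the frame from the question is rebuilt by hand; the names `means` and `df_back` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Group means computed only over rows where A is present.
# The result keeps the index labels of the surviving rows.
means = df[df["A"].notna()].groupby("B")["A"].transform("mean")

# Column assignment aligns on the index, so the filtered-out
# rows (labels 1 and 6) receive NaN in the new column.
df["AA"] = means

# Writing the result back over A, kept in a copy here so the
# original column stays available for comparison.
df_back = df.assign(A=df["AA"])
```

With this approach, rows whose A was missing stay missing; only the rows that already had a value are replaced by their group mean.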

mean ignores NaN values by default, so the simplest method would just be:

df['AA'] = df.groupby('B')['A'].transform('mean')

Output:

       A    B    AA
0   70.0   20  71.0
1    NaN   20  71.0
2   28.0  100  28.0
3   75.0  120  75.0
4   56.0   30  56.0
5   84.0   90  84.0
6    NaN  100  28.0
7   19.0   10  19.0
8   93.0   80  93.0
9   94.0   70  94.0
10  72.0   20  71.0
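
To see why every row in a group receives a value here, including the rows where A was NaN, it may help to look at the per-group means separately from the broadcast step. This is a sketch under the assumption that the question's frame is rebuilt by hand; `group_means` and `via_map` are illustrative names.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# One mean per distinct B value; NaN entries in A are skipped.
group_means = df.groupby("B")["A"].mean()

# transform("mean") broadcasts each group's mean back onto every
# row of that group, so rows 1 and 6 are filled as well.
df["AA"] = df.groupby("B")["A"].transform("mean")

# Equivalent lookup: map each row's B onto the per-group means.
via_map = df["B"].map(group_means)
```

Note that a group whose A values are all NaN would still produce NaN, since there is nothing to average.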
