
Pandas Dataframe fillna() using other known column values

Given the following sample df:

   Other1  Other2     Name Value
0       0       1  Johnson     C
1       0       0  Johnson   NaN
2       1       1    Smith     R
3       1       1    Smith   NaN
4       0       1  Jackson     X
5       1       1  Jackson   NaN
6       1       1  Jackson   NaN

I want to fill the NaN values with the df['Value'] value associated with the name in that row. My desired outcome is the following, which I know can be achieved like so:

df['Value'] = df['Value'].ffill()

   Other1  Other2     Name Value
0       0       1  Johnson     C
1       0       0  Johnson     C
2       1       1    Smith     R
3       1       1    Smith     R
4       0       1  Jackson     X
5       1       1  Jackson     X
6       1       1  Jackson     X

However, this solution will not achieve the desired result if the rows for a given name are not adjacent to one another. I also cannot sort by df['Name'], as the row order is important. Is there an efficient way to fill a given NaN with the value associated with that row's name?
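The failure described above can be reproduced with a small sketch. The data here is hypothetical (the same names, but interleaved rather than grouped), and the variable name filled is only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the same names as in the sample, but interleaved.
df = pd.DataFrame({
    "Name": ["Johnson", "Smith", "Johnson", "Jackson", "Smith"],
    "Value": ["C", "R", np.nan, "X", np.nan],
})

# A plain forward fill ignores Name and copies whichever value came last.
filled = df["Value"].ffill()
print(pd.concat([df["Name"], filled], axis=1))
```

Here the second Johnson row picks up Smith's R, and the second Smith row picks up Jackson's X, which is not the desired result.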

It's also important to note that a given Name will only ever have a single value associated with it. Thank you in advance.

You should use groupby and transform:

df['Value'] = df.groupby('Name')['Value'].transform('first')
df

   Other1  Other2     Name Value
0       0       1  Johnson     C
1       0       0  Johnson     C
2       1       1    Smith     R
3       1       1    Smith     R
4       0       1  Jackson     X
5       1       1  Jackson     X
6       1       1  Jackson     X
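To check that this also works when the names are not adjacent, here is a self-contained sketch on hypothetical interleaved data (not the sample above). It relies on the groupby 'first' aggregation returning the first non-null value in each group, which transform then broadcasts back to every row in the original order:

```python
import numpy as np
import pandas as pd

# Hypothetical data: names interleaved, and Johnson's known value
# appears after a NaN row rather than before it.
df = pd.DataFrame({
    "Name": ["Johnson", "Smith", "Johnson", "Jackson", "Smith", "Jackson"],
    "Value": [np.nan, "R", "C", "X", np.nan, np.nan],
})

# 'first' picks the first non-null Value per Name; transform returns a
# Series aligned to the original index, so row order is untouched.
df["Value"] = df.groupby("Name")["Value"].transform("first")
print(df)
```

Because the result is aligned to the original index, no sorting is needed and the other columns are left alone.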

Peter's answer is not correct because the first valid value may not always be the first in the group, in which case ffill will pollute the next group with the previous group's value.
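Peter's answer is not reproduced here, so the following sketch assumes it relied on a plain forward fill. The data is hypothetical: Smith's group starts with a NaN, so the forward fill carries Johnson's value into it, while the groupby approach does not:

```python
import numpy as np
import pandas as pd

# Hypothetical data: Smith's known value is not the first row of its group.
df = pd.DataFrame({
    "Name": ["Johnson", "Johnson", "Smith", "Smith"],
    "Value": ["C", np.nan, np.nan, "R"],
})

# Plain forward fill: Smith's first row inherits Johnson's "C".
plain = df["Value"].ffill()

# Group-wise first non-null value: each name keeps its own value.
grouped = df.groupby("Name")["Value"].transform("first")

print(pd.DataFrame({"Name": df["Name"], "ffill": plain, "groupby": grouped}))
```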

ALollz's answer is fine, but dropna incurs some degree of overhead.
