How to replace values in a column if another column is NaN?
So this should be the easiest thing on earth. Pseudocode:
Replace column C with NaN if column E is NaN
I know I can do this by pulling out all the dataframe rows where column E is NaN, replacing column C in those rows, and then merging the result back onto the original dataset, but that seems like a lot of work for a simple operation. Why doesn't this work:
Sample data:
import numpy as np
import pandas as pd

dfz = pd.DataFrame({'A': [1, 0, 0, 1, 0, 0],
                    'B': [1, 0, 0, 1, 0, 1],
                    'C': [1, 0, 0, 1, 3, 1],
                    'D': [1, 0, 0, 1, 0, 0],
                    'E': [22.0, 15.0, None, 10.0, None, 557.0]})
Replace function:
def NaNfunc(dfz):
    if dfz['E'] == None:
        return None
    else:
        return dfz['C']

dfz['C'] = dfz.apply(NaNfunc, axis=1)
And how to do this in one line?
Use np.where:
In [34]:
dfz['C'] = np.where(dfz['E'].isnull(), dfz['E'], dfz['C'])
dfz
Out[34]:
   A  B    C  D    E
0  1  1    1  1   22
1  0  0    0  0   15
2  0  0  NaN  0  NaN
3  1  1    1  1   10
4  0  0  NaN  0  NaN
5  0  1    1  0  557
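The same np.where call can be written with np.nan spelled out as the replacement value, which makes the intent a little easier to read. This is a minimal self-contained sketch that uses only columns C and E from the sample data (the other columns are dropped here for brevity, which is an assumption of this example):

```python
import numpy as np
import pandas as pd

dfz = pd.DataFrame({'C': [1, 0, 0, 1, 3, 1],
                    'E': [22.0, 15.0, None, 10.0, None, 557.0]})

# Where E is missing, write NaN into C; elsewhere keep C unchanged.
dfz['C'] = np.where(dfz['E'].isnull(), np.nan, dfz['C'])

# C started out as integers but is float64 now, because NaN is a float value.
print(dfz['C'].dtype)
print(dfz)
```

Note that column C is upcast from integer to float, since NaN cannot be stored in a plain integer column.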
Or simply mask the df:
In [38]:
dfz.loc[dfz['E'].isnull(), 'C'] = dfz['E']
dfz
Out[38]:
   A  B    C  D    E
0  1  1    1  1   22
1  0  0    0  0   15
2  0  0  NaN  0  NaN
3  1  1    1  1   10
4  0  0  NaN  0  NaN
5  0  1    1  0  557
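As for why the original apply version doesn't work: the likely cause is that pandas stores the None entries in the float column E as NaN, and NaN compared with == None evaluates to False, so the function always falls through to the else branch. A sketch of a row-wise version that tests with pd.isna instead (the name nan_func and the trimmed two-column frame are assumptions made for this illustration):

```python
import numpy as np
import pandas as pd

dfz = pd.DataFrame({'C': [1, 0, 0, 1, 3, 1],
                    'E': [22.0, 15.0, None, 10.0, None, 557.0]})

# The None values in E were converted to NaN, and NaN never equals None.
print(dfz.loc[2, 'E'] == None)   # False
print(pd.isna(dfz.loc[2, 'E']))  # True

def nan_func(row):
    # pd.isna() is True for both None and NaN, unlike `== None`.
    if pd.isna(row['E']):
        return np.nan
    return row['C']

dfz['C'] = dfz.apply(nan_func, axis=1)
print(dfz)
```

This gives the same result as the vectorized versions above, but apply with axis=1 calls the function once per row, so np.where or the .loc mask will generally be faster on large frames.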