How to fill one column's missing values conditional on another column's value in Pandas?
I have a dataframe that looks like this:
import numpy as np
import pandas as pd
d = {'col1': [np.nan, 19, 32, np.nan, 54, 67], 'col2': [0, 1, 0, 1, 1, 1]}
df = pd.DataFrame(d)
I want to fill the missing values in "col1" based on the values of "col2". Specifically, where "col2" is 0, a missing value in "col1" should be filled with 0; otherwise "col1" should be left unchanged. In this case, my output should look like:
d_updated = {'col1': [0, 19, 32, np.nan, 54, 67], 'col2': [0, 1, 0, 1, 1, 1]}
df_updated = pd.DataFrame(d_updated)
To get the above output, I tried taking the index of the rows where "col2" equals 0 and calling fillna() on them:
ix = list(df[df["col2"] == 0].index)
df["col2"].loc[ix].fillna(0, inplace = True)
However, my approach doesn't work and I don't know why. Thanks in advance.
Try using loc with boolean indexing:
df.loc[(df['col1'].isna()) & (df['col2'] == 0), 'col1'] = df['col2']
Output:
   col1  col2
0   0.0     0
1  19.0     1
2  32.0     0
3   NaN     1
4  54.0     1
5  67.0     1
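As for why the attempt in the question has no effect: as far as I can tell, df["col2"].loc[ix] returns a new object when indexed with a list of labels, so fillna(..., inplace=True) modifies that temporary copy and not df. It also targets "col2" when the missing values are in "col1". Below is a minimal sketch of the same boolean-mask idea that assigns the scalar 0 directly. The names original, mask and alt are only for illustration, and the Series.mask variant is an alternative I am adding, not part of the answer above.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({'col1': [np.nan, 19, 32, np.nan, 54, 67],
                         'col2': [0, 1, 0, 1, 1, 1]})

# Rows where col1 is missing AND col2 equals 0.
mask = original['col1'].isna() & original['col2'].eq(0)

# Variant 1: a single .loc call with row mask and column label,
# so the assignment is written to df itself, not to a temporary copy.
df = original.copy()
df.loc[mask, 'col1'] = 0

# Variant 2: build a new Series without modifying the frame in place.
# Series.mask replaces values where the condition is True.
alt = original['col1'].mask(mask, 0)

print(df)
```

Both variants leave row 3 as NaN, because "col2" is 1 there and the mask is False.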