
How to fill one column's missing values conditional on another column's value in Pandas?

I have a dataframe that looks like this:

import numpy as np
import pandas as pd
d = {'col1': [np.nan, 19, 32, np.nan, 54, 67], 'col2': [0, 1, 0, 1, 1, 1]}
df = pd.DataFrame(d)

I want to fill the missing values in "col1" based on the values of "col2". Specifically, I want to fill a missing value in "col1" with 0 if "col2" is 0 in the same row, and otherwise leave "col1" as it is. In this case, my output should look like:

d_updated = {'col1': [0, 19, 32, np.nan, 54, 67], 'col2': [0, 1, 0, 1, 1, 1]}
df_updated = pd.DataFrame(d_updated)

To get the above output, I tried to get the index of the rows where "col2" equals 0 and then use fillna():

ix = list(df[df["col2"] == 0].index)
df["col2"].loc[ix].fillna(0, inplace = True)

However, my approach doesn't work and I don't know why. Thanks in advance.

Try using loc with boolean indexing:

df.loc[(df['col1'].isna()) & (df['col2'] == 0), 'col1'] = df['col2']

Output:

   col1  col2
0   0.0     0
1  19.0     1
2  32.0     0
3   NaN     1
4  54.0     1
5  67.0     1
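
As for why the attempt in the question leaves the dataframe unchanged, here is a minimal sketch, assuming the same sample data. Selecting a column and then calling .loc with a list of labels returns a new Series, so fillna(..., inplace=True) fills that temporary object rather than df. The snippet in the question also selects "col2" instead of "col1". The names tmp, untouched and mask are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [np.nan, 19, 32, np.nan, 54, 67],
                   'col2': [0, 1, 0, 1, 1, 1]})

# Chained indexing: .loc[ix] on the selected column returns a new Series,
# so the in-place fill changes that temporary object, not df.
ix = df.index[df['col2'] == 0]
tmp = df['col1'].loc[ix]
tmp.fillna(0, inplace=True)
untouched = df['col1'].isna().sum()  # df still has both missing values

# A single .loc assignment writes directly into df.
mask = df['col1'].isna() & (df['col2'] == 0)
df.loc[mask, 'col1'] = 0
```

After the last line, only the row where "col2" is 1 still has a missing value in "col1".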

m = (df.col2 == 0) & (df.col1.isna())  # boolean mask to use with loc

Then either of the following works:

df.loc[m, 'col1'] = df.loc[m, 'col1'].fillna(0)

or
df.loc[m,'col1'] = df.loc[m,'col1'].replace('nan', np.nan).fillna(0)
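
As a possible alternative that avoids assigning through loc, the same mask can be passed to Series.mask, which replaces values where the condition is True and keeps the rest. This is only a sketch on the sample data from the question. The names result and expected are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [np.nan, 19, 32, np.nan, 54, 67],
                   'col2': [0, 1, 0, 1, 1, 1]})

# Same condition as above: col1 is missing and col2 equals 0.
m = (df['col2'] == 0) & (df['col1'].isna())

# Series.mask replaces values where the condition is True and keeps the rest.
# assign returns a new dataframe, so df itself is left unchanged.
result = df.assign(col1=df['col1'].mask(m, 0))

# The desired output from the question.
expected = pd.DataFrame({'col1': [0, 19, 32, np.nan, 54, 67],
                         'col2': [0, 1, 0, 1, 1, 1]})
pd.testing.assert_frame_equal(result, expected)
```

Because assign returns a new dataframe, this form is convenient inside a method chain. Use df['col1'] = df['col1'].mask(m, 0) to overwrite the column instead.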


