Fill missing value based on value from another column in the same row

I have a DataFrame that looks like this:

ColA | ColB | ColC | ColD |
-----|------|------|------|
100  |   A  |  X1  |  NaN |
200  |   B  |  X2  |  AAA |
300  |   C  |  X3  |  NaN |

I want to fill the missing values in ColD based on the value in ColA. The result I need is:

if value in ColA = 100 then value in ColD = "BBB"
if value in ColA = 300 then value in ColD = "CCC"

ColA | ColB | ColC | ColD |
-----|------|------|------|
100  |   A  |  X1  |  BBB |
200  |   B  |  X2  |  AAA |
300  |   C  |  X3  |  CCC |
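
To reproduce the example, the input frame could be built as follows. This is a minimal sketch that assumes ColA holds integers and that the missing entries in ColD are `np.nan`; the question does not state the dtypes.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; np.nan marks the missing ColD entries.
# Assumption: ColA is an integer column.
df = pd.DataFrame({
    "ColA": [100, 200, 300],
    "ColB": ["A", "B", "C"],
    "ColC": ["X1", "X2", "X3"],
    "ColD": [np.nan, "AAA", np.nan],
})

print(df)
print(df["ColD"].isna().sum())  # 2
```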

You can use combine_first or fillna to fill the missing values directly from ColA:

df.ColD = df.ColD.combine_first(df.ColA)
print (df)
   ColA ColB ColC ColD
0   100    A   X1  100
1   200    B   X2  AAA
2   300    C   X3  300

Or:

df.ColD = df.ColD.fillna(df.ColA)
print (df)
   ColA ColB ColC ColD
0   100    A   X1  100
1   200    B   X2  AAA
2   300    C   X3  300
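
When the argument is a Series, both methods match values by index label rather than by position, which is why filling from another column of the same frame works row by row. The sketch below illustrates this with a deliberately reordered filler Series; the names `col_d` and `filler` and the values `from_0` and `from_2` are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a column with gaps at index labels 0 and 2.
col_d = pd.Series([np.nan, "AAA", np.nan], index=[0, 1, 2])

# Filler values listed in a different order; they are matched by label.
filler = pd.Series(["from_2", "from_0"], index=[2, 0])

filled = col_d.fillna(filler)
print(filled.tolist())  # ['from_0', 'AAA', 'from_2']
```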

EDIT: First use map to build a Series s from ColA, then pass that Series to combine_first or fillna:

d = {100: "BBB", 300:'CCC'}
s = df.ColA.map(d)
print (s)
0    BBB
1    NaN
2    CCC
Name: ColA, dtype: object

df.ColD = df.ColD.combine_first(s)
print (df)
   ColA ColB ColC ColD
0   100    A   X1  BBB
1   200    B   X2  AAA
2   300    C   X3  CCC
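
The fillna variant mentioned above is not shown, so here is a minimal sketch of it. It assumes the sample frame from the question, rebuilt here so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (rebuilt so this block is self-contained).
df = pd.DataFrame({
    "ColA": [100, 200, 300],
    "ColB": ["A", "B", "C"],
    "ColC": ["X1", "X2", "X3"],
    "ColD": [np.nan, "AAA", np.nan],
})

d = {100: "BBB", 300: "CCC"}

# Map ColA through the dictionary, then use the result to fill only the gaps.
df["ColD"] = df["ColD"].fillna(df["ColA"].map(d))

print(df["ColD"].tolist())  # ['BBB', 'AAA', 'CCC']
```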

It replaces only NaN values; existing values are left untouched:

print (df)
   ColA ColB ColC ColD
0   100    A   X1  EEE <- changed value to EEE
1   200    B   X2  AAA
2   300    C   X3  NaN

d = {100: "BBB", 300:'CCC'}
s = df.ColA.map(d)
df.ColD = df.ColD.combine_first(s)
print (df)
   ColA ColB ColC ColD
0   100    A   X1  EEE
1   200    B   X2  AAA
2   300    C   X3  CCC
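
One caveat that may be worth checking: map returns NaN for any ColA value that is not a key in the dictionary, so those rows stay missing after the fill. The sketch below uses a hypothetical partial dictionary, `d_partial`, to show this.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (rebuilt so this block is self-contained).
df = pd.DataFrame({
    "ColA": [100, 200, 300],
    "ColB": ["A", "B", "C"],
    "ColC": ["X1", "X2", "X3"],
    "ColD": [np.nan, "AAA", np.nan],
})

# Hypothetical dictionary that has no entry for 300.
d_partial = {100: "BBB"}

s = df["ColA"].map(d_partial)          # NaN wherever the key is absent
df["ColD"] = df["ColD"].combine_first(s)

# Row 0 is filled, row 1 keeps its value, row 2 is still missing.
print(df["ColD"].isna().tolist())  # [False, False, True]
```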

Define a mapping function:

def my_map_func(x):
    return "BBB" if x==100 else "CCC"

Right now, df looks like:

ColA | ColB | ColC | ColD
-----|------|------|-----
100  |    A |   X1 |  NaN
200  |    B |   X2 |  AAA
300  |    C |   X3 |  NaN

Select the rows where ColD is NaN, and fill them with the mapped value obtained from column ColA:

df.loc[df.ColD.isnull(), 'ColD'] = df.loc[df.ColD.isnull(), 'ColA'].apply(my_map_func)

Here, we select only the rows for which ColD is NaN by indexing with a boolean Series, and then select the column we are interested in, ColA. In simple terms: df.loc[selected_rows, selected_columns].

Now, the DataFrame df looks like:

ColA | ColB | ColC | ColD
-----|------|------|-----
100  |    A |   X1 |  BBB
200  |    B |   X2 |  AAA
300  |    C |   X3 |  CCC
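
Note that my_map_func returns "CCC" for every value other than 100, which is fine here only because the NaN rows happen to have ColA equal to 100 or 300. A possible variation, sketched below, keeps the boolean-mask selection but looks values up in a dictionary instead. The names `mask` and `d` are illustrative, and the block uses `.loc` for the selection.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (rebuilt so this block is self-contained).
df = pd.DataFrame({
    "ColA": [100, 200, 300],
    "ColB": ["A", "B", "C"],
    "ColC": ["X1", "X2", "X3"],
    "ColD": [np.nan, "AAA", np.nan],
})

d = {100: "BBB", 300: "CCC"}

# Compute the boolean mask once and reuse it for both selections.
mask = df["ColD"].isna()

# Only the masked rows are written; unmapped ColA values would stay NaN.
df.loc[mask, "ColD"] = df.loc[mask, "ColA"].map(d)

print(df["ColD"].tolist())  # ['BBB', 'AAA', 'CCC']
```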

