Fill values in a column of a particular row with the value of same column from another row based on a condition on second column in Pandas
Fill missing value based on value from another column in the same row
I have a DataFrame that looks like this:
ColA | ColB | ColC | ColD |
-----|------|------|------|
100 | A | X1 | NaN |
200 | B | X2 | AAA |
300 | C | X3 | NaN |
I want to fill the missing values in ColD based on the values in ColA. The result I need is:
if value in ColA = 100 then value in ColD = "BBB"
if value in ColA = 300 then value in ColD = "CCC"
ColA | ColB | ColC | ColD |
-----|------|------|------|
100 | A | X1 | BBB |
200 | B | X2 | AAA |
300 | C | X3 | CCC |
You can use combine_first or fillna:
df.ColD = df.ColD.combine_first(df.ColA)
print (df)
ColA ColB ColC ColD
0 100 A X1 100
1 200 B X2 AAA
2 300 C X3 300
Or:
df.ColD = df.ColD.fillna(df.ColA)
print (df)
ColA ColB ColC ColD
0 100 A X1 100
1 200 B X2 AAA
2 300 C X3 300
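For anyone who wants to run the two variants above side by side, here is a minimal sketch. The DataFrame construction is an assumption reconstructed from the table in the question (ColD is forced to object dtype so it can hold both strings and the integers copied from ColA). With identical indexes both calls fill the same cells; the difference only shows up when the indexes differ, because combine_first returns the union of both indexes while fillna keeps the caller's index.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's table (assumed construction).
df = pd.DataFrame({
    "ColA": [100, 200, 300],
    "ColB": ["A", "B", "C"],
    "ColC": ["X1", "X2", "X3"],
    "ColD": pd.Series([np.nan, "AAA", np.nan], dtype=object),
})

# Both calls keep existing ColD values and take ColA only where ColD is NaN.
via_combine = df["ColD"].combine_first(df["ColA"])
via_fillna = df["ColD"].fillna(df["ColA"])

print(via_combine.tolist())
print(via_fillna.tolist())
```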
Edit: first build a Series s with map, then pass that Series to combine_first or fillna:
d = {100: "BBB", 300:'CCC'}
s = df.ColA.map(d)
print (s)
0 BBB
1 NaN
2 CCC
Name: ColA, dtype: object
df.ColD = df.ColD.combine_first(s)
print (df)
ColA ColB ColC ColD
0 100 A X1 BBB
1 200 B X2 AAA
2 300 C X3 CCC
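The edit mentions fillna as an alternative but only shows combine_first, so here is a sketch of the fillna variant. The DataFrame construction is again an assumption based on the question's table (only the two relevant columns are included). Note that map returns NaN for any ColA value that is not a key of the dict, which is why row 1 is left for the existing "AAA".

```python
import numpy as np
import pandas as pd

# Only the two columns involved in the fill (assumed construction).
df = pd.DataFrame({
    "ColA": [100, 200, 300],
    "ColD": pd.Series([np.nan, "AAA", np.nan], dtype=object),
})

d = {100: "BBB", 300: "CCC"}
s = df["ColA"].map(d)               # 200 is not a key, so s is NaN there
df["ColD"] = df["ColD"].fillna(s)   # fill only the NaN cells of ColD

print(df)
```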
It only replaces NaN:
print (df)
ColA ColB ColC ColD
0 100 A X1 EEE <- changed value to EEE
1 200 B X2 AAA
2 300 C X3 NaN
d = {100: "BBB", 300:'CCC'}
s = df.ColA.map(d)
df.ColD = df.ColD.combine_first(s)
print (df)
ColA ColB ColC ColD
0 100 A X1 EEE
1 200 B X2 AAA
2 300 C X3 CCC
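If this pattern comes up often, the map-then-fill steps can be wrapped in a small helper. The function name fill_from_mapping and its parameters are hypothetical, made up for this sketch; it works on a copy so the input frame is left unchanged, and like the example above it never overwrites an existing value such as "EEE".

```python
import numpy as np
import pandas as pd


def fill_from_mapping(df, key_col, target_col, mapping):
    """Return a copy of df where NaNs in target_col are filled
    with mapping[value of key_col]; existing values are kept."""
    out = df.copy()
    candidates = out[key_col].map(mapping)   # NaN where the key is unmapped
    out[target_col] = out[target_col].fillna(candidates)
    return out


# Same situation as above: row 0 already holds "EEE" (assumed construction).
df = pd.DataFrame({
    "ColA": [100, 200, 300],
    "ColD": pd.Series(["EEE", "AAA", np.nan], dtype=object),
})

result = fill_from_mapping(df, "ColA", "ColD", {100: "BBB", 300: "CCC"})
print(result)
```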
Define a mapping function:
def my_map_func(x):
    return "BBB" if x == 100 else "CCC"
At this point, df looks like:
ColA | ColB | ColC | ColD
-----|------|------|-----
100 | A | X1 | NaN
200 | B | X2 | AAA
300 | C | X3 | NaN
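Select the rows where ColD is NaN and fill them with the mapped values computed from ColA:
df.loc[df.ColD.isnull(), 'ColD'] = df.loc[df.ColD.isnull(), 'ColA'].apply(my_map_func)
Here we select only the rows where ColD is NaN by indexing with a boolean Series, and pick the column we are interested in (ColA on the right-hand side, ColD on the left). Put simply: df.loc[selected_rows, selected_columns]. Note that df.ix, which older pandas versions accepted here, was removed in pandas 1.0, so df.loc is used instead.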
Now the DataFrame df looks like:
ColA | ColB | ColC | ColD
-----|------|------|-----
100 | A | X1 | BBB
200 | B | X2 | AAA
300 | C | X3 | CCC
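One thing to watch with my_map_func: it returns "CCC" for every value other than 100, so a missing ColD in a row with some other ColA value would also become "CCC". A possible variation, sketched below, keeps the boolean-mask selection but swaps the function for a dict passed to map, so unmapped keys stay NaN. The extra row with ColA = 400 is hypothetical and is only there to show the difference.

```python
import numpy as np
import pandas as pd

# Question's data plus one hypothetical row (ColA = 400) with a missing ColD.
df = pd.DataFrame({
    "ColA": [100, 200, 300, 400],
    "ColD": pd.Series([np.nan, "AAA", np.nan, np.nan], dtype=object),
})

mapping = {100: "BBB", 300: "CCC"}
missing = df["ColD"].isna()   # boolean Series marking rows to fill

# Fill only the missing rows; 400 has no mapping, so that cell stays NaN.
df.loc[missing, "ColD"] = df.loc[missing, "ColA"].map(mapping)

print(df)
```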