How to replace NaN values in a column based on values in another column

Question

I have two columns: col1 is the level of education and col2 is the person's job. col2 has some NaN values, and I want to replace them based on the value of col1. For example, if col1 = 'bachelor' then col2 should be 'teacher', if col1 = 'high school' then col2 should be 'actor', and so on. col1 has 7 distinct values.

I tried to write a function like this:

def rep_nan(x):
    if x['col1']=='bachelor':
        x['col2']='teacher'
    elif x['col1']=='blabla':
        x['col2']='blabla'
    .....
    elif x['col1']=='high school':
        x['col2']='actor'

Then I applied it to my dataset:

df.apply(rep_nan,axis=1)

But the result is a column of None values.

Where is the error? Or how else could I do this?

Answer 1

You can build a dictionary that maps each col1 value to its replacement:

rep_nan = {
    'bachelor': 'teacher',
    'blabla': 'blabla',
    'high school': 'actor'
}

Then we can replace the NaN values with:

df.loc[df['col2'].isnull(), 'col2'] = df[df['col2'].isnull()]['col1'].replace(rep_nan)

For example:

>>> df
          col1   col2
0     bachelor   None
1     bachelor  clown
2       blabla   None
3  high school   None
>>> df.loc[df['col2'].isnull(), 'col2'] = df[df['col2'].isnull()]['col1'].replace(rep_nan)
>>> df
          col1     col2
0     bachelor  teacher
1     bachelor    clown
2       blabla   blabla
3  high school    actor
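
As a complementary sketch (not part of the original answer), the same fill can be written with `Series.map` and `Series.fillna`. The block also shows why the function in the question produced `None`: it never returned the row, so `apply` collected `None` for each one. The sample frame and the names `raw`, `filled`, `rowwise` and `rep_nan_row` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the example above.
raw = pd.DataFrame({
    "col1": ["bachelor", "bachelor", "blabla", "high school"],
    "col2": [np.nan, "clown", np.nan, np.nan],
})

rep_nan = {"bachelor": "teacher", "blabla": "blabla", "high school": "actor"}

# Vectorized variant: map() looks up each col1 value in the dictionary
# (values missing from the dictionary become NaN), and fillna() uses the
# looked-up value only where col2 is missing.
filled = raw.copy()
filled["col2"] = filled["col2"].fillna(filled["col1"].map(rep_nan))


def rep_nan_row(row):
    # Row-wise variant of the question's function. It must return the row;
    # without a return statement, apply() yields None for every row.
    row = row.copy()  # avoid mutating the object pandas passes in
    if pd.isnull(row["col2"]):
        row["col2"] = rep_nan.get(row["col1"], row["col2"])
    return row


rowwise = raw.apply(rep_nan_row, axis=1)

print(filled)
print(rowwise)
```

Note one difference from the `replace` version: with `map`, a col1 value that is absent from the dictionary leaves col2 as NaN, whereas `replace` would copy the col1 value into col2. The row-wise version is shown only to explain the original error; on large frames the vectorized form is usually much faster.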

Answer 1: accepted, 2019-09-01 10:34:43