
How to set values in a pandas DataFrame based on NaN values of another column?

I have a DataFrame named df with original shape (4361, 15). Some values in the agefm column are NaN:

> df[df.agefm.isnull() == True].agefm.shape
(2282,)

Then I create a new column and set all its values to 0:

df['nevermarr'] = 0

I would like to set nevermarr to 1 in the rows where agefm is NaN:

df[df.agefm.isnull() == True].nevermarr = 1

Nothing changed:

> df['nevermarr'].sum()
0

What am I doing wrong?

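df[df.agefm.isnull() == True] returns a new DataFrame (a copy), so the assignment changes that temporary copy and never reaches df. The simplest fix is numpy.where: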

df['nevermarr'] = np.where(df.agefm.isnull(), 1, 0)
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          0
2    6.0          0

Or use loc; the == True can be omitted because isnull() already returns a boolean mask:

df.loc[df.agefm.isnull(), 'nevermarr'] = 1
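To make the difference concrete, here is a minimal sketch on a three-row frame (made-up data, not the 4361-row frame from the question). The names subset, chained_total and loc_total are only for illustration. It applies the question's pattern first, then the loc version, and compares the column sums.

```python
import warnings

import numpy as np
import pandas as pd

# Small stand-in for the question's data (illustrative values only).
df = pd.DataFrame({"agefm": [np.nan, 5, 6]})
df["nevermarr"] = 0

# Boolean indexing builds a new DataFrame, so the assignment below
# lands on that separate object, not on df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some pandas versions warn about this pattern
    subset = df[df.agefm.isnull()]
    subset.nevermarr = 1

chained_total = int(df["nevermarr"].sum())  # df itself is unchanged

# A single .loc call selects rows and column at once and writes into df.
df.loc[df.agefm.isnull(), "nevermarr"] = 1
loc_total = int(df["nevermarr"].sum())

print(chained_total, loc_total)
```

The first total stays at 0 because only the temporary object was modified. The second reflects the one NaN row in this toy frame.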

Or mask (the output is for the sample data below, where nevermarr starts as 7, 2, 3):

df['nevermarr'] = df.nevermarr.mask(df.agefm.isnull(), 1)
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          2
2    6.0          3

Sample:

import pandas as pd
import numpy as np

df = pd.DataFrame({'nevermarr':[7,2,3],
                   'agefm':[np.nan,5,6]})

print (df)
   agefm  nevermarr
0    NaN          7
1    5.0          2
2    6.0          3

df.loc[df.agefm.isnull(), 'nevermarr'] = 1
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          2
2    6.0          3
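One detail worth noticing in the outputs above: numpy.where with a 0 fallback rewrites every row, while loc and mask only touch the rows where agefm is NaN and keep the other values. A short sketch on the same sample data, using hypothetical variable names (is_missing, where_result, loc_df, mask_result) for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"nevermarr": [7, 2, 3],
                   "agefm": [np.nan, 5, 6]})
is_missing = df["agefm"].isnull()

# np.where builds a whole new column: 1 where agefm is NaN, 0 everywhere else.
where_result = np.where(is_missing, 1, 0).tolist()

# loc assigns only to the selected rows (done on a copy to keep df intact here).
loc_df = df.copy()
loc_df.loc[is_missing, "nevermarr"] = 1

# mask replaces values where the condition is True and keeps the rest.
mask_result = df["nevermarr"].mask(is_missing, 1).tolist()

print(where_result)                   # [1, 0, 0]
print(loc_df["nevermarr"].tolist())   # [1, 2, 3]
print(mask_result)                    # [1, 2, 3]
```

In the question the column was initialised to 0 first, so all three approaches would give the same result there. The choice only matters when the column already holds values you want to keep.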
