
Python Dataframe: How to map a column value with adjacent columns value?

I want to fill a column's value from the adjacent column's value using pandas in Python. I have a df with these values:

name          exm_date                         att_date    
tom       2019-03-05 11:48:03.166             2020-03-05 11:48:03.166 
mark      2018-03-05 11:48:03.166             2020-03-05 11:48:03.166     
matt      2020-08-05 11:48:03.166              NAT
rob       2020-06-05 11:48:03.166              NAT
chuck     2020-02-05 11:48:03.166              NAT
tom       2020-03-05 11:48:03.166              NAT
matt      2020-02-05 11:48:03.166             2020-03-05 11:48:03.166     
chuck     2020-06-05 11:48:03.166             2020-03-05 11:48:03.166                       

For rows where att_date is NaT, the date should be taken from exm_date. Expected output:

name          exm_date                         att_date    
tom       2019-03-05 11:48:03.166             2020-03-05 11:48:03.166 
mark      2018-03-05 11:48:03.166             2020-03-05 11:48:03.166     
matt      2020-08-05 11:48:03.166             2020-08-05 11:48:03.166 
rob       2020-06-05 11:48:03.166             2020-06-05 11:48:03.166  
chuck     2020-02-05 11:48:03.166             2020-02-05 11:48:03.166
tom       2020-03-05 11:48:03.166             2020-03-05 11:48:03.166
matt      2020-02-05 11:48:03.166             2020-03-05 11:48:03.166     
chuck     2020-06-05 11:48:03.166             2020-03-05 11:48:03.166     

You can use .loc with a boolean mask that evaluates to True where att_date is null:

df.loc[df['att_date'].isna(),'att_date'] = df['exm_date']
print(df)
     name          exm_date                att_date
0    tom 2019-03-05 11:48:03.166 2020-03-05 11:48:03.166
1   mark 2018-03-05 11:48:03.166 2020-03-05 11:48:03.166
2   matt 2020-08-05 11:48:03.166 2020-08-05 11:48:03.166
3    rob 2020-06-05 11:48:03.166 2020-06-05 11:48:03.166
4  chuck 2020-02-05 11:48:03.166 2020-02-05 11:48:03.166
5    tom 2020-03-05 11:48:03.166 2020-03-05 11:48:03.166
6   matt 2020-02-05 11:48:03.166 2020-03-05 11:48:03.166
7  chuck 2020-06-05 11:48:03.166 2020-03-05 11:48:03.166
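For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the dates arrive as strings with the literal text "NAT" for missing entries (the question does not show the dtypes), so both columns are converted with pd.to_datetime first. The three-row sample is cut down from the question's data.

```python
import pandas as pd

# Cut-down sample of the question's data; the string dtype is an assumption.
df = pd.DataFrame({
    "name": ["tom", "matt", "rob"],
    "exm_date": ["2019-03-05 11:48:03.166",
                 "2020-08-05 11:48:03.166",
                 "2020-06-05 11:48:03.166"],
    "att_date": ["2020-03-05 11:48:03.166", "NAT", "NAT"],
})

# Convert to real datetimes; anything unparseable becomes NaT.
df["exm_date"] = pd.to_datetime(df["exm_date"])
df["att_date"] = pd.to_datetime(df["att_date"], errors="coerce")

# Boolean mask: True where att_date is missing.
mask = df["att_date"].isna()

# Overwrite only the masked rows with the value from exm_date.
df.loc[mask, "att_date"] = df.loc[mask, "exm_date"]
print(df)
```

Rows that already had an att_date are left untouched, because the assignment only targets rows selected by the mask.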

You can use fillna or combine_first:

df['att_date'] = df['att_date'].fillna(df['exm_date'])
#or
df['att_date'] = df['att_date'].combine_first(df['exm_date'])
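As a rough illustration that the two calls give the same result when both columns share an index, here is a small sketch on two standalone Series. The names exm and att are made up for the example and stand in for the two DataFrame columns.

```python
import pandas as pd

# Hypothetical stand-ins for df['exm_date'] and df['att_date'].
exm = pd.to_datetime(pd.Series(["2020-08-05 11:48:03.166",
                                "2020-02-05 11:48:03.166"]))
att = pd.to_datetime(pd.Series([None, "2020-03-05 11:48:03.166"]))

# Both fill the missing entry in att with the value at the same index in exm.
filled = att.fillna(exm)
combined = att.combine_first(exm)

print(filled)
print(combined)
```

Both methods align on the index. The difference shows up only when the indexes differ: fillna keeps the caller's index, while combine_first returns the union of the two indexes.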

You could use the apply method to replace missing values:

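import pandas as pd

# pd.isna detects NaT; comparing against the string 'NAT' never matches in a datetime column
df['att_date'] = df.apply(
    lambda row: row['exm_date'] if pd.isna(row['att_date']) else row['att_date'],
    axis=1)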
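A note on the missing-value check in the row-wise approach: when the column holds real datetimes, a missing entry is pd.NaT rather than the string 'NAT', so an equality test against the string does not detect it. The sketch below shows the difference on a two-row frame whose values are made up for the example.

```python
import pandas as pd

# Comparing NaT with a string is False; pd.isna is the reliable check.
string_check = (pd.NaT == "NAT")
isna_check = pd.isna(pd.NaT)
print(string_check, isna_check)

# Hypothetical two-row frame with one missing att_date.
df = pd.DataFrame({
    "exm_date": pd.to_datetime(["2020-08-05 11:48:03.166",
                                "2020-02-05 11:48:03.166"]),
    "att_date": pd.to_datetime([None, "2020-03-05 11:48:03.166"]),
})

# Row-wise fallback: take exm_date only when att_date is missing.
result = df.apply(
    lambda row: row["exm_date"] if pd.isna(row["att_date"]) else row["att_date"],
    axis=1,
)
print(result)
```

Because apply with axis=1 calls the function once per row, it is usually slower on large frames than the vectorized .loc or fillna approaches.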


 