简体   繁体   中英

replace values in the row of one dataframe based on the values in columns of another dataframe pandas

I have following dataframe df1

sports_id   school_id      time          activity_name
   1             2      09:00-11:00       soccer match
   3             1      08:00-09:00       soccer practice
   5             2      08:00-11:00        baseball

And following dataframe df2 containing student id and each date of a month december

student_id  sports_id   school_id   12-01-2018  12-02-2018  12-03-2018    12-04-2018
 0001          5           2        08:00-11:00    Rest     08:00-11:00   08:00-09:00
 0002          3           1        08:00-09:00    Rest     08:00-09:00   08:00-09:00
 0003          1           2        09:00-11:00    Rest     09:00-11:00   09:00-10:00

Based on sports_id,school_id and time in df1,I want to map activity_name to each student in df2 to get the following dataframe.If there is no match keep the existing value in the dataframe.The resultant dataframe would be following

student_id  sports_id   school_id   12-01-2018     12-02-2018    12-03-2018    12-04-2018
 0001           5          2        baseball            Rest    baseball       08:00-09:00
 0002           3          1       soccer practice      Rest    soccer practice    soccer practice
 0003           1          2        baseball            Rest    baseball        09:00-10:00

just to clarify, if sports_id=5,school_id=2 and time is 08:00-11:00,then replace the value 08:00-11:00 in df2 rows with"baseball"(as in df1) Since the combination sports_id=5,school_id=2 and time= 08:00-09:00 does not exist in df1,keep the time 08:00-09:00 as it is in df2 for the date 12-04-2018

In short,sports_id,school_id and time are the 3 keys and activity_name is the value corresponding to those 3 keys.

I am trying to do something like this

df2.applymap(df1.set_index(['sports_id','school_id','time'])['activity_name'])

but it's not working.

A length solution, change the shape of df2, merge and reshape again.

new_df = df2.set_index(['student_id','sports_id','school_id']).stack().reset_index(name = 'time').merge(df1, how = 'outer')
new_df.activity_name.fillna(new_df.time, inplace=True)
new_df = new_df.drop('time', 1).set_index(['student_id','sports_id','school_id', 'level_3']).activity_name.unstack().reset_index()
new_df.columns.name = None

    student_id  sports_id   school_id   12-01-2018      12-02-2018  12-03-2018      12-04-2018
0   1           5           2           baseball        Rest        baseball        08:00-09:00
1   2           3           1           soccer practice Rest        soccer practice soccer practice
2   3           1           2           soccer match    Rest        soccer match    09:00-10:00

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM