There are two dataframes, df and events, which look like this:
import pandas as pd
df = pd.DataFrame({'Place':['university','residential','hospital','university','residential','hospital'],
'Date':['2017-01-01','2017-01-01','2017-01-01','2017-01-02','2017-01-02','2017-01-02'],
'Event':['None','None','None','None','None','None']
})
events = pd.DataFrame({'Place':['university','residential','hospital'], 'Start_Date':['2017-01-01','2017-01-01','2017-01-01'],
'End_Date':['2017-02-26','2017-01-02','2017-01-02'],
'Event':['UniHolidays','PublicHoliday','PublicHoliday']})
#Convert to datetime
events.Start_Date = pd.to_datetime(events.Start_Date.astype(str), format='%Y-%m-%d')
events.End_Date = pd.to_datetime(events.End_Date.astype(str), format='%Y-%m-%d')
df.Date = pd.to_datetime(df.Date.astype(str), format='%Y-%m-%d')
df has 1 record for every date in 2017 for each place
df:
Date Place Event
2017-01-01 university None
2017-01-01 residential None
2017-01-01 hospital None
2017-01-02 university None
2017-01-02 residential None
2017-01-02 hospital None
The second dataframe contains events for these places but with a date range
events:
Place        Start_Date  End_Date    Event
university   2017-01-01  2017-02-26  UniHoliday
residential  2017-01-01  2017-01-02  PublicHoliday
hospital     2017-01-01  2017-01-02  PublicHoliday
The task is to update df using events such that if df.Place = events.Place and df.Date is within the range events.Start_Date to events.End_Date, then df.Event is updated with the corresponding events.Event.
The expected output is:
Date Place Event
2017-01-01 university UniHoliday
2017-01-01 residential PublicHoliday
2017-01-01 hospital PublicHoliday
2017-01-02 university UniHoliday
2017-01-02 residential PublicHoliday
2017-01-02 hospital PublicHoliday
There are no overlapping events; every place has a unique record of events.
So far I have been thinking along the lines of "Populate column in data frame based on a range found in another dataframe", but I can't get my head around it. Any help is appreciated. Thank you!
Solution 1:
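Add this line to the end of the code:
df['Event'] = events['Event'].tolist() * 2
Note that this does not check Place or the date range. It works here only because df repeats the places in the same order as events and every date falls inside its event's range.
Then:
print(df)
outputs: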
Date Event Place
0 2017-01-01 UniHolidays university
1 2017-01-01 PublicHoliday residential
2 2017-01-01 PublicHoliday hospital
3 2017-01-02 UniHolidays university
4 2017-01-02 PublicHoliday residential
5 2017-01-02 PublicHoliday hospital
----------------------------------------
Solution 2:
If you want the Event column in the right position, use:
df = df.drop('Event', axis=1)
df.insert(2, 'Event', events['Event'].tolist() * 2)
at the end of the code.
Then:
print(df)
outputs:
Date Place Event
0 2017-01-01 university UniHolidays
1 2017-01-01 residential PublicHoliday
2 2017-01-01 hospital PublicHoliday
3 2017-01-02 university UniHolidays
4 2017-01-02 residential PublicHoliday
5 2017-01-02 hospital PublicHoliday
Both Solution 1 and Solution 2 work,
but it is better not to hard-code the repeat count.
Use:
df = df.drop('Event', axis=1)
df.insert(2, 'Event', events['Event'].tolist() * (len(df) // len(events)))
at the end of the code.
Then:
print(df)
outputs:
Date Place Event
0 2017-01-01 university UniHolidays
1 2017-01-01 residential PublicHoliday
2 2017-01-01 hospital PublicHoliday
3 2017-01-02 university UniHolidays
4 2017-01-02 residential PublicHoliday
5 2017-01-02 hospital PublicHoliday
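The solutions above fill Event by row position, so they depend on df repeating the places in the same order as events. A minimal sketch that instead checks Place and the date range explicitly is given here. It is one possible approach rather than the method from the answers above, and it assumes that the range is inclusive at both ends (as the expected output suggests) and that events do not overlap for a place (as stated in the question). The names merged, in_range and matched are only illustrative.

```python
import pandas as pd

# Same sample data as in the question, already converted to datetime.
df = pd.DataFrame({
    'Place': ['university', 'residential', 'hospital'] * 2,
    'Date': pd.to_datetime(['2017-01-01'] * 3 + ['2017-01-02'] * 3),
})
events = pd.DataFrame({
    'Place': ['university', 'residential', 'hospital'],
    'Start_Date': pd.to_datetime(['2017-01-01'] * 3),
    'End_Date': pd.to_datetime(['2017-02-26', '2017-01-02', '2017-01-02']),
    'Event': ['UniHolidays', 'PublicHoliday', 'PublicHoliday'],
})

# Pair every df row with the events of the same place.
# reset_index() keeps the original row label in an 'index' column.
merged = df.reset_index().merge(events, on='Place', how='left')

# Keep an event only where the date lies inside its range (inclusive).
in_range = merged['Date'].between(merged['Start_Date'], merged['End_Date'])
matched = merged.loc[in_range].set_index('index')['Event']

# Align back on the original rows; rows without a matching event get 'None'.
# Assumption: no overlapping events per place, so each row matches at most once.
df['Event'] = matched.reindex(df.index).fillna('None')

print(df)
```

Because the match is made on Place and Date rather than on position, this also works when df is not sorted or when some dates fall outside every event range. The intermediate merge has one row per (df row, event of that place) pair, so it can grow large if a place has many events.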