I have a pandas dataframe with a structure similar to:
Application | Account | Application_Date
1 | 444444 | 10/01/2018
2 | 444444 | 09/01/2018
3 | 555555 | 10/01/2018
And a different dataframe with a structure like this:
Case | Account | Case_Date
1 | 444444 | 09/01/2018
2 | 444444 | 11/01/2018
3 | 444444 | 10/01/2018
4 | 555555 | 07/01/2018
I want to check whether each Account in the first dataframe exists in the second dataframe, counting only the cases where Case_Date is greater than or equal to Application_Date. The result should go in a new column in the first dataframe, along with the matching case numbers, like:
Application | Account | Application_Date | Case_Exists | Case_Number
1 | 444444 | 10/01/2018 | Y | 2, 3
2 | 444444 | 09/01/2018 | Y | 1, 2, 3
3 | 555555 | 10/01/2018 | N |
Could you please advise?
Thank you!
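It's a bit of a convoluted solution, but it gets you there: merge the two dataframes, use loc to keep the rows where Case_Date >= Application_Date, group by Application and Account, and get the unique cases. Then merge the result back onto df1 and assign Y to the non-null values (where cases were found) and N to the rest:
>>> df1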
Application Account Application_Date
0 1 444444 10/01/2018
1 2 444444 09/01/2018
2 3 555555 10/01/2018
>>> df2
Case Account Case_Date
0 1 444444 09/01/2018
1 2 444444 11/01/2018
2 3 444444 10/01/2018
3 4 555555 07/01/2018
import pandas as pd

# set to datetime
df1['Application_Date'] = pd.to_datetime(df1['Application_Date'])
df2['Case_Date'] = pd.to_datetime(df2['Case_Date'])
# first merge (Account is the only column the two frames share)
merged = df2.merge(df1, on='Account')
# loc and groupby
cases = (merged.loc[merged['Case_Date'] >= merged['Application_Date']]
               .groupby(['Account', 'Application'])['Case']
               .unique())
# merge back
final = (cases.to_frame('Case_Number')
              .merge(df1, left_index=True,
                     right_on=['Account', 'Application'],
                     how='outer')
         # Following line is just to re-adjust column order
         [['Application', 'Account', 'Application_Date', 'Case_Number']])
# assign Y and N
final['Case_Exists'] = final.Case_Number.notnull().map({True: 'Y', False: 'N'})
>>> final
Application Account Application_Date Case_Number Case_Exists
0 1 444444 2018-10-01 [2, 3] Y
1 2 444444 2018-09-01 [1, 2, 3] Y
2 3 555555 2018-10-01 NaN N
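If you want the case numbers as a comma-separated string (as in the question's desired output) rather than an array, the same merge, filter and group idea can be wrapped up as in the sketch below. The helper name `add_case_info` is made up for illustration, and the sketch assumes that the dates are month/day/year (consistent with the parsed output above) and that each Application value appears only once in df1.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Application": [1, 2, 3],
    "Account": [444444, 444444, 555555],
    "Application_Date": ["10/01/2018", "09/01/2018", "10/01/2018"],
})
df2 = pd.DataFrame({
    "Case": [1, 2, 3, 4],
    "Account": [444444, 444444, 444444, 555555],
    "Case_Date": ["09/01/2018", "11/01/2018", "10/01/2018", "07/01/2018"],
})

# Assumption: the strings are month/day/year; adjust the format if they are not.
df1["Application_Date"] = pd.to_datetime(df1["Application_Date"], format="%m/%d/%Y")
df2["Case_Date"] = pd.to_datetime(df2["Case_Date"], format="%m/%d/%Y")


def add_case_info(apps, cases):
    """Hypothetical helper: flag applications that have a case on or after
    the application date and list those case numbers as a string."""
    # Pair every application with every case on the same account.
    merged = apps.merge(cases, on="Account", how="left")
    # Keep only cases dated on or after the application date.
    matched = merged[merged["Case_Date"] >= merged["Application_Date"]]
    # One comma-separated string of case numbers per application.
    case_lists = (
        matched.sort_values("Case")
        .groupby("Application")["Case"]
        .agg(lambda s: ", ".join(str(int(c)) for c in s))
    )
    out = apps.copy()
    out["Case_Exists"] = (
        out["Application"].isin(case_lists.index).map({True: "Y", False: "N"})
    )
    # Applications without a qualifying case get an empty string.
    out["Case_Number"] = out["Application"].map(case_lists).fillna("")
    return out


result = add_case_info(df1, df2)
print(result)
```

With the sample data, Case_Exists comes out as Y, Y, N and Case_Number as "2, 3", "1, 2, 3" and an empty string. Because the merge pairs every application with every case on the same account, the intermediate frame can get large when accounts have many rows on both sides.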