Filling dataframe column with calculated value from a different dataframe

Question

I have two dataframes. The first contains a series of ID numbers with two dates, and the second contains a series of ID numbers with a date for an event.

import pandas as pd
df1 = pd.DataFrame({'id': ['19403', '59208', '45290'], 'date_1': ['2021-01-04', '2021-01-05', '2021-05-08'], 'date_2': ['2021-03-07', '2021-01-12', '2021-06-15']})
df2 = pd.DataFrame({'id': ['19403', '19403', '19403', '59208', '59208', '45290', '45290', '45290', '45290', '45290', '45290', '45290'], 'date': ['2021-01-05', '2021-02-12', '2021-03-17', '2021-01-06', '2021-01-08', '2021-03-14', '2021-03-17', '2021-05-09', '2021-05-12', '2021-05-23', '2021-06-01', '2021-08-14' ]})

I'm trying to merge the two dataframes so that for every row in the first dataframe, it counts the number of events in the second where the date is between the two dates in the first and the ID number matches.

output = pd.DataFrame({'id': ['19403', '59208', '45290'], 'date_1': ['2021-01-04', '2021-01-05', '2021-05-08'], 'date_2': ['2021-03-07', '2021-01-12', '2021-06-15'], 'count': [2, 2, 4]})

I've been able to mock it up one row at a time (below), but I'm struggling to apply it to the whole dataframe.

count = df2[(df2['id'] == '19403') & (df2['date'].between('2021-01-04', '2021-03-07'))]['date'].count()

Any suggestions? I'm probably missing something obvious here.

Answer 1

(df1.merge(df2, on='id', how='outer')
    .query("(date_1 <= date <= date_2)")
    .groupby(['id', 'date_1', 'date_2'], as_index=False)
    .size())

Output:

    id      date_1      date_2    size
0   19403   2021-01-04  2021-03-07  2
1   45290   2021-05-08  2021-06-15  4
2   59208   2021-01-05  2021-01-12  2

Filling dataframe column with calculated value from a different dataframe

Question

1 answers

solution1
1 ACCPTED 2021-10-28 16:31:54

Filling dataframe column with calculated value from a different dataframe

Question

1 answers

solution1 1 ACCPTED 2021-10-28 16:31:54

solution1
1 ACCPTED 2021-10-28 16:31:54