How to expand a DataFrame with extra rows based on a set time interval between a start date and an end date?

Given the following table:

import pandas as pd
df = pd.DataFrame({'pers_no': [1,1,2], 'start_date': ['2000-03-01','2000-06-01', '2001-04-01'], 'end_date': ['2000-05-01','2000-07-01', '2001-06-01'], 'value': [199,219,249]})

   pers_no  start_date    end_date  value
0        1  2000-03-01  2000-05-01    199
1        1  2000-06-01  2000-07-01    219
2        2  2001-04-01  2001-06-01    249

How can I expand the DataFrame so that it has one row for each interval step (for example, each month) between the start date and the end date? The result should look like this:


   pers_no        date  value
0        1  2000-03-01    199
1        1  2000-04-01    199
2        1  2000-05-01    199
3        1  2000-06-01    219
4        1  2000-07-01    219
5        2  2001-04-01    249
6        2  2001-05-01    249
7        2  2001-06-01    249

You can build a new column with date_range that holds every month start for each row, then explode that column into separate rows:

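def get_dt_range(row):
    # Convert to Timestamp first: the sample columns hold strings,
    # and adding an offset to a plain string raises a TypeError.
    start = pd.Timestamp(row['start_date'])
    end = pd.Timestamp(row['end_date']) + pd.offsets.MonthEnd()
    return pd.date_range(start, end, freq='MS')

df['date'] = df[['start_date','end_date']].apply(get_dt_range, axis=1)
df.explode('date')[['pers_no', 'date', 'value']]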

Output:

   pers_no       date  value
0        1 2000-03-01    199
0        1 2000-04-01    199
0        1 2000-05-01    199
1        1 2000-06-01    219
1        1 2000-07-01    219
2        2 2001-04-01    249
2        2 2001-05-01    249
2        2 2001-06-01    249
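
The output above keeps the original row labels (0, 0, 0, 1, ...), whereas the question asks for a fresh 0..7 index. A minimal self-contained sketch of the same explode idea that also renumbers the rows might look like this. It assumes the date columns arrive as strings, as in the sample data, and the helper name expand_monthly is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'pers_no': [1, 1, 2],
    'start_date': ['2000-03-01', '2000-06-01', '2001-04-01'],
    'end_date': ['2000-05-01', '2000-07-01', '2001-06-01'],
    'value': [199, 219, 249],
})

def expand_monthly(frame):
    # Hypothetical helper: one output row per month start in [start_date, end_date].
    out = frame.copy()
    # Assumption: the columns may hold strings, so parse them first.
    out['start_date'] = pd.to_datetime(out['start_date'])
    out['end_date'] = pd.to_datetime(out['end_date'])
    # One list of month starts per input row.
    out['date'] = [
        list(pd.date_range(start, end, freq='MS'))
        for start, end in zip(out['start_date'], out['end_date'])
    ]
    # explode() turns each list element into its own row;
    # ignore_index=True (pandas >= 1.1) renumbers the result from 0.
    out = out.explode('date', ignore_index=True)
    # The exploded column is object dtype, so convert it back to datetime64.
    out['date'] = pd.to_datetime(out['date'])
    return out[['pers_no', 'date', 'value']]

result = expand_monthly(df)
print(result)
```

Other intervals should only need a different freq argument, for example 'D' for daily or 'W' for weekly steps.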

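Alternatively, build one small DataFrame per input row with date_range and concatenate them. Note that this example uses a daily frequency (freq='d'); pass freq='MS' instead to get one row per month start: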

pd.concat([pd.DataFrame({'Date': pd.date_range(row.start_date, row.end_date, freq='d'),
               'value': row.value,
               'pers_no': row.pers_no}, columns=['Date', 'value','pers_no']) 
           for i, row in df.iterrows()], ignore_index=True)

which gives:

          Date  value  pers_no
0   2000-03-01    199        1
1   2000-03-02    199        1
2   2000-03-03    199        1
3   2000-03-04    199        1
4   2000-03-05    199        1
..         ...    ...      ...
150 2001-05-28    249        2
151 2001-05-29    249        2
152 2001-05-30    249        2
153 2001-05-31    249        2
154 2001-06-01    249        2
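
Since the question asks for monthly rows, here is a sketch of the same concat approach with freq='MS' swapped in. It uses itertuples rather than iterrows, which is a substitution on my part and not part of the answer above, and the column order is chosen to match the requested result:

```python
import pandas as pd

df = pd.DataFrame({
    'pers_no': [1, 1, 2],
    'start_date': ['2000-03-01', '2000-06-01', '2001-04-01'],
    'end_date': ['2000-05-01', '2000-07-01', '2001-06-01'],
    'value': [199, 219, 249],
})

monthly = pd.concat(
    [
        # The scalar pers_no and value are broadcast to the length of the date range.
        pd.DataFrame({
            'pers_no': row.pers_no,
            # freq='MS' yields the first day of each month in the range.
            'date': pd.date_range(row.start_date, row.end_date, freq='MS'),
            'value': row.value,
        })
        for row in df.itertuples(index=False)
    ],
    ignore_index=True,  # renumber the combined rows from 0
)
print(monthly)
```

Building one DataFrame per row is easy to read but can be slow on large inputs; the explode approach is likely the better fit there.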

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this post, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 