Create a DataFrame with a row for each date in the date range of another DataFrame

Question

Below is a script for a simplified version of the DataFrame in question:

import pandas as pd

plan_dates=pd.DataFrame({'id':[1,2,3,4,5],
                         'start_date':['2021-01-01','2021-01-01','2021-01-03','2021-01-04','2021-01-05'],
                         'end_date':  ['2021-01-04','2021-01-03','2021-01-03','2021-01-06','2021-01-08']})

plan_dates

    id  start_date  end_date
0   1   2021-01-01  2021-01-04
1   2   2021-01-01  2021-01-03
2   3   2021-01-03  2021-01-03
3   4   2021-01-04  2021-01-06
4   5   2021-01-05  2021-01-08

I would like to create a new DataFrame that has, for each id, one row for every day on which the plan is active (start and end dates inclusive).

Intended DataFrame:

    id  active_days
0   1   2021-01-01
1   1   2021-01-02
2   1   2021-01-03
3   1   2021-01-04
4   2   2021-01-01
5   2   2021-01-02
6   2   2021-01-03
7   3   2021-01-03
8   4   2021-01-04
9   4   2021-01-05
10  4   2021-01-06
11  5   2021-01-05
12  5   2021-01-06
13  5   2021-01-07
14  5   2021-01-08

Any help would be greatly appreciated.

Answer 1

Use:

# the first part is the same as in https://stackoverflow.com/a/66869805/2901002
plan_dates['start_date'] = pd.to_datetime(plan_dates['start_date'])
plan_dates['end_date'] = pd.to_datetime(plan_dates['end_date']) + pd.Timedelta(1, unit='d')

s = plan_dates['end_date'].sub(plan_dates['start_date']).dt.days
df = plan_dates.loc[plan_dates.index.repeat(s)].copy()
counter = df.groupby(level=0).cumcount()
df['start_date'] = df['start_date'].add(pd.to_timedelta(counter, unit='d'))
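
To make the repeat/cumcount step easier to follow, here is a small sketch on a two-row toy frame. The names toy, n_days, expanded and offset are made up for illustration and are not part of the answer; the sketch adds 1 to the day count instead of shifting end_date, which should be equivalent for whole-day dates.

```python
import pandas as pd

# Toy frame (illustrative only): a 3-day plan and a 1-day plan, ends inclusive.
toy = pd.DataFrame({'id': [1, 3],
                    'start_date': pd.to_datetime(['2021-01-01', '2021-01-03']),
                    'end_date': pd.to_datetime(['2021-01-03', '2021-01-03'])})

# Inclusive length of each plan in days; the +1 plays the role of the
# one-day Timedelta that the answer adds to end_date.
n_days = (toy['end_date'] - toy['start_date']).dt.days + 1

# Repeat each row label once per active day, giving labels [0, 0, 0, 1].
expanded = toy.loc[toy.index.repeat(n_days)].copy()

# Running position of each row within its original label: 0, 1, 2, 0.
offset = expanded.groupby(level=0).cumcount()

# start_date plus the offset (in days) yields the individual active days.
expanded['active_days'] = expanded['start_date'] + pd.to_timedelta(offset, unit='d')

print(expanded[['id', 'active_days']])
```

Each original row is copied once per active day, and cumcount supplies the number of days to add to the start date of each copy.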

Then drop the end_date column, rename start_date to active_days, and reset to a default index:

df = (df.drop('end_date', axis=1)
        .rename(columns={'start_date':'active_days'})
        .reset_index(drop=True))
print(df)
    id active_days
0    1  2021-01-01
1    1  2021-01-02
2    1  2021-01-03
3    1  2021-01-04
4    2  2021-01-01
5    2  2021-01-02
6    2  2021-01-03
7    3  2021-01-03
8    4  2021-01-04
9    4  2021-01-05
10   4  2021-01-06
11   5  2021-01-05
12   5  2021-01-06
13   5  2021-01-07
14   5  2021-01-08
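
For comparison, a more explicit (and probably slower) way to get the same result is to build one small frame per plan with pd.date_range and concatenate them. This is only a sketch and not the method from the answer; the function name expand_active_days is hypothetical, and it assumes start_date and end_date are both inclusive, as in the intended output.

```python
import pandas as pd

plan_dates = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'start_date': ['2021-01-01', '2021-01-01', '2021-01-03', '2021-01-04', '2021-01-05'],
    'end_date':   ['2021-01-04', '2021-01-03', '2021-01-03', '2021-01-06', '2021-01-08'],
})

def expand_active_days(plans):
    """Return one row per (id, day) for every day a plan is active.

    Hypothetical helper for illustration; treats both dates as inclusive.
    """
    frames = [
        # pd.date_range includes both endpoints by default; the scalar id
        # is broadcast to the length of the date range.
        pd.DataFrame({'id': row.id,
                      'active_days': pd.date_range(row.start_date, row.end_date, freq='D')})
        for row in plans.itertuples(index=False)
    ]
    return pd.concat(frames, ignore_index=True)

result = expand_active_days(plan_dates)
print(result)
```

Because this loops over the rows in Python, the vectorized repeat/cumcount approach above is likely the better choice for large frames; the loop version is mainly easier to read.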

Answer 1
1 accepted 2021-03-30 11:42:50