
How to add a date_range between two dates - Python Pandas

I would like to handle intervals that span several days. As you can see in my df, one row begins on 2019-10-25 and ends on 2019-10-27:

begin                       end                          info
2019-10-25 10:39:58.352073  2019-10-25 10:40:06.266782   toto
2019-10-25 16:35:22.485574  2019-10-27 09:50:31.713179   tata <------ HERE
2019-10-27 09:50:31.713179  2019-10-27 09:50:31.713192   titi
2019-10-28 14:04:33.095633  2019-10-28 14:05:07.639344   tete

I would like to add as many daily time slots (date 00:00:00 to date 23:59:59.9) as fit between these two dates and copy the info value into each one, like this:

2019-10-25 16:35:22.485574  2019-10-25 23:59:59.999999   tata
2019-10-26 00:00:00.000000  2019-10-26 23:59:59.999999   tata
2019-10-27 00:00:00.000000  2019-10-27 09:50:31.713179   tata
  • If the begin date differs from the end date, calculate the number of days between them
  • Keep the original begin and set a new end at 'date 23:59:59.9'
  • Add one new row for each full day in between, as a date_range would
  • Keep the original end and set a new begin at 'date 00:00:00.0'
  • Fill in 'info'

The final expected result:

begin                       end                          info
2019-10-25 10:39:58.352073  2019-10-25 10:40:06.266782   toto

2019-10-25 16:35:22.485574  2019-10-25 23:59:59.999999   tata
2019-10-26 00:00:00.000000  2019-10-26 23:59:59.999999   tata
2019-10-27 00:00:00.000000  2019-10-27 09:50:31.713179   tata

2019-10-27 09:50:31.713179  2019-10-27 09:50:31.713192   titi
2019-10-28 14:04:33.095633  2019-10-28 14:05:07.639344   tete

But I don't know how to implement the date_range, fill in info, and add the right number of rows.

Thanks for your time.

Assuming begin and end are already of Timestamp type:

import numpy as np
import pandas as pd

# For each row, generate one Timedelta offset per calendar day the interval touches
n = (
    (df['end'].dt.normalize() - df['begin'].dt.normalize())
        .apply(lambda d: [pd.Timedelta(days=i) for i in range(d.days+1)])
        .explode()
).rename('n')
df = df.join(n)

# Adjust the begin and end of each row
adjusted_begin = np.max([
    df['begin'],
    df['begin'].dt.normalize() + df['n']
], axis=0)

# The end of each day is midnight minus 100 ms, i.e. 23:59:59.9
adjusted_end = np.min([
    df['end'],
    pd.Series(adjusted_begin).dt.normalize() + pd.Timedelta(days=1, milliseconds=-100)
], axis=0)

# Final assembly
df = df.assign(begin_=adjusted_begin, end_=adjusted_end)

Result:

                       begin                        end  info      n                     begin_                       end_
0 2019-10-25 10:39:58.352073 2019-10-25 10:40:06.266782  toto 0 days 2019-10-25 10:39:58.352073 2019-10-25 10:40:06.266782
1 2019-10-25 16:35:22.485574 2019-10-27 09:50:31.713179  tata 0 days 2019-10-25 16:35:22.485574 2019-10-25 23:59:59.900000
1 2019-10-25 16:35:22.485574 2019-10-27 09:50:31.713179  tata 1 days 2019-10-26 00:00:00.000000 2019-10-26 23:59:59.900000
1 2019-10-25 16:35:22.485574 2019-10-27 09:50:31.713179  tata 2 days 2019-10-27 00:00:00.000000 2019-10-27 09:50:31.713179
2 2019-10-27 09:50:31.713179 2019-10-27 09:50:31.713192  titi 0 days 2019-10-27 09:50:31.713179 2019-10-27 09:50:31.713192
3 2019-10-28 14:04:33.095633 2019-10-28 14:05:07.639344  tete 0 days 2019-10-28 14:04:33.095633 2019-10-28 14:05:07.639344

Trim off the columns you don't need.
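
For reference, here is a minimal end-to-end sketch of the same idea that returns only the begin, end and info columns. It is not the code from the answer above: it repeats rows with `index.repeat` instead of `explode`, and it ends each day at 23:59:59.999999 (midnight minus one microsecond) so that the output matches the expected result in the question rather than the 23:59:59.9 produced above. The function name `split_by_day` is hypothetical, and the sketch assumes a non-empty frame with a unique index and end >= begin on every row.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "begin": pd.to_datetime([
        "2019-10-25 10:39:58.352073",
        "2019-10-25 16:35:22.485574",
        "2019-10-27 09:50:31.713179",
        "2019-10-28 14:04:33.095633",
    ]),
    "end": pd.to_datetime([
        "2019-10-25 10:40:06.266782",
        "2019-10-27 09:50:31.713179",
        "2019-10-27 09:50:31.713192",
        "2019-10-28 14:05:07.639344",
    ]),
    "info": ["toto", "tata", "titi", "tete"],
})


def split_by_day(frame):
    """Split each [begin, end] interval at midnight: one row per calendar day.

    Hypothetical helper; assumes a non-empty frame, a unique index,
    and end >= begin on every row.
    """
    # Number of calendar days each interval touches (1 if begin and end share a date).
    n_days = (frame["end"].dt.normalize() - frame["begin"].dt.normalize()).dt.days + 1

    # Repeat each row once per touched day, then renumber the rows 0..N-1.
    out = frame.loc[frame.index.repeat(n_days)].reset_index(drop=True)

    # Day counter within each original row: 0, 1, 2, ...
    day_no = np.concatenate([np.arange(k) for k in n_days])

    # Calendar-day window covered by each repeated row.
    day_start = out["begin"].dt.normalize() + pd.to_timedelta(day_no, unit="D")
    # Assumption: a day ends one microsecond before the next midnight.
    day_end = day_start + pd.Timedelta(days=1) - pd.Timedelta(microseconds=1)

    # Clip the original interval to the day window.
    out["begin"] = out["begin"].where(out["begin"] >= day_start, day_start)
    out["end"] = out["end"].where(out["end"] <= day_end, day_end)
    return out


result = split_by_day(df)
print(result)
```

With the sample data, the tata row should come back as three rows (2019-10-25, 2019-10-26 and 2019-10-27) and every other row should be unchanged. To get the 23:59:59.9 boundary used in the answer above, replace `pd.Timedelta(microseconds=1)` with `pd.Timedelta(milliseconds=100)`.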


 