
Daily schedule as an index in Pandas

I want to represent a daily schedule, originally given as a CSV file, as a Pandas DataFrame. The key to each row in the schedule is an hourly range within a day. The ranges do not overlap. For example:

00:00, 01:00, some data
01:00, 03:00, some more data
03:00, 04:30, some other data

How can I create a DataFrame in which one level of the index represents the start-to-end hour range?

Starting from your example DataFrame (with column names added):

In [78]: df
Out[78]: 
   start    end            other
0  00:00  01:00        some data
1  01:00  03:00   some more data
2  03:00  04:30  some other data
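
For reference, a frame like the one above could be built from the CSV roughly as follows. This is a minimal sketch: it assumes the file has no header row, and it uses the three sample rows from the question as an in-memory string in place of a real file path. The column names start, end and other are simply the ones chosen above.

```python
import io

import pandas as pd

# The three sample rows from the question; a real schedule would be read from a file.
csv_text = """00:00, 01:00, some data
01:00, 03:00, some more data
03:00, 04:30, some other data
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,                      # assumption: the CSV has no header row
    names=["start", "end", "other"],  # column names chosen for this example
    skipinitialspace=True,            # drop the space that follows each comma
    dtype=str,                        # keep the times as plain strings for now
)
print(df)
```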

Assuming start and end are strings, we can convert them to datetimes with to_datetime. Because the data contain only hours and minutes, a default date is filled in:

In [79]: pd.to_datetime(df['end'], format='%H:%M')
Out[79]: 
0   1900-01-01 01:00:00
1   1900-01-01 03:00:00
2   1900-01-01 04:30:00
Name: end, dtype: datetime64[ns]

But assuming start and end always fall on the same day, this default date does not matter if we only use the datetimes to calculate the time difference between start and end:

In [80]: df['range'] = pd.to_datetime(df['end'], format='%H:%M') - pd.to_datetime(df['start'], format='%H:%M')


In [81]: df
Out[81]: 
   start    end            other    range
0  00:00  01:00        some data 01:00:00
1  01:00  03:00   some more data 02:00:00
2  03:00  04:30  some other data 01:30:00
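
If the goal is to have the range itself act as the index, rather than only its length as a column, one possible approach is a pandas IntervalIndex built from the converted start and end values. This is a sketch under assumptions, not part of the answer above: the choice of closed="left" (each range includes its start and excludes its end) and the name schedule are illustrative, and the intervals still carry the default 1900-01-01 date.

```python
import pandas as pd

df = pd.DataFrame({
    "start": ["00:00", "01:00", "03:00"],
    "end": ["01:00", "03:00", "04:30"],
    "other": ["some data", "some more data", "some other data"],
})

# Convert both columns; the date part defaults to 1900-01-01.
start = pd.to_datetime(df["start"], format="%H:%M").to_numpy()
end = pd.to_datetime(df["end"], format="%H:%M").to_numpy()

# Assumption: ranges are half-open, [start, end), so adjacent ranges do not share a point.
schedule = df.set_index(pd.IntervalIndex.from_arrays(start, end, closed="left"))

# Look up the row whose range contains 02:15 (on the same default date).
row = schedule.loc[pd.Timestamp("1900-01-01 02:15")]
print(row["other"])
```

Because the ranges do not overlap, a lookup with a single timestamp matches at most one row. A timestamp outside every range raises a KeyError.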
