Pandas: Selecting DataFrame rows between two dates (Datetime Index)

I have a Pandas DataFrame with a DatetimeIndex and one column, MSE Loss. The index is formatted as follows:

DatetimeIndex(['2015-07-16 07:14:41', '2015-07-16 07:14:48',
           '2015-07-16 07:14:54', '2015-07-16 07:15:01',
           '2015-07-16 07:15:07', '2015-07-16 07:15:14',...]

It spans several days.

I want to select all the rows (all times) of a particular day without knowing the actual time intervals in advance. For example, between 2015-07-16 07:00:00 and 2015-07-16 23:00:00.

I tried the approach outlined here, but

df[date_from:date_to]

outputs:

KeyError: Timestamp('2015-07-16 07:00:00')

So it wants exact indices. Furthermore, I don't have a date column, only an index with the dates.

What is the best way to select a whole day by just providing a date such as 2015-07-16, and how could I then select a specific time range within that day?

Option 1:

Sample df:

df
                      a
2015-07-16 07:14:41  12
2015-07-16 07:14:48  34
2015-07-16 07:14:54  65
2015-07-16 07:15:01  34
2015-07-16 07:15:07  23
2015-07-16 07:15:14   1

It looks like you're trying this without .loc. Slice the index by label through .loc instead:

df.loc['2015-07-16 07:00:00':'2015-07-16 23:00:00']
                      a
2015-07-16 07:14:41  12
2015-07-16 07:14:48  34
2015-07-16 07:14:54  65
2015-07-16 07:15:01  34
2015-07-16 07:15:07  23
2015-07-16 07:15:14   1
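
A self-contained sketch of Option 1 that rebuilds the sample frame above. The variable names and the `sort_index()` call are my additions, not part of the answer. As far as I know, slicing with bounds that are not present in the index relies on a sorted (monotonic) index, so an unsorted index is one plausible cause of the KeyError in the question. A bare date string is used for the "whole day" case.

```python
import pandas as pd

# Rebuild the sample frame shown in the answer above.
idx = pd.to_datetime([
    '2015-07-16 07:14:41', '2015-07-16 07:14:48', '2015-07-16 07:14:54',
    '2015-07-16 07:15:01', '2015-07-16 07:15:07', '2015-07-16 07:15:14',
])
df = pd.DataFrame({'a': [12, 34, 65, 34, 23, 1]}, index=idx)

# Assumption: sort first, because label slicing with bounds that are
# not in the index depends on a monotonic index.
df = df.sort_index()

# Time range within one day; both endpoints are inclusive.
in_range = df.loc['2015-07-16 07:00:00':'2015-07-16 23:00:00']

# Whole day: a bare date string matches every timestamp on that date.
whole_day = df.loc['2015-07-16']

# A narrower window, to show that the slice really filters rows.
first_minute = df.loc['2015-07-16 07:14:00':'2015-07-16 07:14:59']
print(first_minute)
```

If the real index holds several days, `whole_day` would contain only the rows dated 2015-07-16.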

Option 2:

You can use boolean indexing on the index:

df[(df.index.get_level_values(0) >= '2015-07-16 07:00:00') & (df.index.get_level_values(0) <= '2015-07-16 23:00:00')]
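
For a single-level DatetimeIndex the same mask can, I believe, be built from `df.index` directly. The sketch below uses a small made-up frame (the unsorted order and the 2015-07-17 row are hypothetical, added only to show the filter working); the mask compares element by element, so it should not need a sorted index.

```python
import pandas as pd

# Hypothetical data: deliberately unsorted, with one row on another day.
idx = pd.to_datetime([
    '2015-07-16 07:15:07', '2015-07-16 07:14:41',
    '2015-07-17 09:00:00', '2015-07-16 07:14:54',
])
df = pd.DataFrame({'a': [23, 12, 99, 65]}, index=idx)

start = pd.Timestamp('2015-07-16 07:00:00')
end = pd.Timestamp('2015-07-16 23:00:00')

# Each comparison yields a boolean array; & combines them elementwise.
mask = (df.index >= start) & (df.index <= end)

# Rows keep their original order; only matching rows survive.
selected = df[mask]
print(selected)
```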

You can use truncate:

begin = pd.Timestamp('2015-07-16 07:00:00')
end = pd.Timestamp('2015-07-16 23:00:00')

df.truncate(before=begin, after=end)
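
A runnable version of the `truncate` call, on a small made-up frame (the timestamps and values are hypothetical). To my knowledge `truncate` raises a ValueError on an unsorted index, so the sketch sorts first.

```python
import pandas as pd

# Hypothetical data with rows before and after the wanted window.
idx = pd.to_datetime([
    '2015-07-16 23:30:00', '2015-07-16 07:14:41',
    '2015-07-16 06:00:00', '2015-07-16 12:00:00',
])
df = pd.DataFrame({'a': [65, 12, 1, 34]}, index=idx)

# truncate expects a sorted index, so sort before calling it.
df = df.sort_index()

begin = pd.Timestamp('2015-07-16 07:00:00')
end = pd.Timestamp('2015-07-16 23:00:00')

# Drops every row before `begin` and every row after `end`.
trimmed = df.truncate(before=begin, after=end)
print(trimmed)
```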

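You can use the pandas method between_time. It filters on the time of day using the DatetimeIndex, so call it on the DataFrame itself (there is no date column here) and pass times rather than full dates. It matches that time window on every day in the index, so select the day first if you need only one:

the_timed_df = df.between_time('07:00', '23:00')

This should do what you want, if I did not mess up some detail :-)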
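
A sketch of how `between_time` might be combined with a day selection. As far as I know it looks only at the time of day in the DatetimeIndex, so on its own it matches the window on every date. The frame below is made up for illustration.

```python
import datetime as dt

import pandas as pd

# Hypothetical data: three rows on 2015-07-16, one on 2015-07-17.
idx = pd.to_datetime([
    '2015-07-16 06:30:00', '2015-07-16 07:14:41',
    '2015-07-16 22:10:00', '2015-07-17 08:00:00',
])
df = pd.DataFrame({'a': [5, 12, 34, 65]}, index=idx)

# Time-of-day filter only: the 2015-07-17 row also falls in the window.
by_clock = df.between_time('07:00', '23:00')

# Restrict to one day first, then filter by time of day.
# datetime.time objects are accepted as well as strings.
one_day = df.loc['2015-07-16'].between_time(dt.time(7, 0), dt.time(23, 0))
print(one_day)
```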
