Python: How to develop a between_time-like method on pandas 0.9.0?

Question

I am stuck with pandas 0.9.0 because I'm working under Python 2.5, so the between_time method is not available to me.

I have a DataFrame indexed by dates and would like to keep only the rows that fall between certain hours, e.g. between 08:00 and 09:00, for every date in the DataFrame df.

import pandas as pd
import numpy as np
import datetime

dates = pd.date_range(start="08/01/2009",end="08/01/2012",freq="10min")
df = pd.DataFrame(np.random.rand(len(dates), 1)*1500, index=dates, columns=['Power'])

How can I develop a method that provides the same functionality as the between_time method?

NB: The original problem I am trying to solve is described in "Python: Filter DataFrame in Pandas by hour, day and month grouped by year".

Answer 1

UPDATE:

Try using:

# indexer_between_time returns integer positions, so index with .iloc
df.iloc[df.index.indexer_between_time('08:00','09:50')]

OLD answer:

I'm not sure that it'll work on Pandas 0.9.0, but it's worth a try:

df[(df.index.hour >= 8) & (df.index.hour <= 9)]

PS: Please be aware that this is not the same as between_time. It checks only the hour (so it keeps everything from 08:00 up to 09:59), whereas between_time can check full times, as in df.between_time('08:01:15','09:13:28').

Hint: download the source code of a newer version of Pandas and take a look at the definition of the indexer_between_time() function in pandas/tseries/index.py. You can clone it for your needs.
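As a rough illustration of that hint, the sketch below builds a boolean mask from the index's hour, minute and second attributes, so that bounds such as 08:01:15 are honoured. The helper names between_time_mask and _seconds_since_midnight are hypothetical, and this is not the pandas implementation. It is written against a recent pandas; whether it runs unchanged on 0.9.0 is an assumption that would need checking.

```python
import datetime

import numpy as np
import pandas as pd


def _seconds_since_midnight(t):
    # Convert a datetime.time into whole seconds since 00:00:00.
    return t.hour * 3600 + t.minute * 60 + t.second


def between_time_mask(index, start, end):
    """Boolean mask of the timestamps whose time of day lies in [start, end].

    Hypothetical helper: `start` and `end` are datetime.time objects and
    both bounds are inclusive. Sub-second precision is ignored.
    """
    secs = (np.asarray(index.hour, dtype=np.int64) * 3600
            + np.asarray(index.minute, dtype=np.int64) * 60
            + np.asarray(index.second, dtype=np.int64))
    lo = _seconds_since_midnight(start)
    hi = _seconds_since_midnight(end)
    if lo <= hi:
        return (secs >= lo) & (secs <= hi)
    # The window wraps past midnight, e.g. 22:00 to 02:00.
    return (secs >= lo) | (secs <= hi)


dates = pd.date_range(start="2009-08-01", end="2009-08-03", freq="10min")
df = pd.DataFrame(np.random.rand(len(dates), 1) * 1500,
                  index=dates, columns=["Power"])

mask = between_time_mask(df.index,
                         datetime.time(8, 1, 15),
                         datetime.time(9, 13, 28))
subset = df[mask]
print(subset.head())
```

Comparing seconds since midnight, rather than the hour alone, is what lets the bounds fall anywhere inside an hour. The `|` branch is one simple way to support a window that crosses midnight.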

UPDATE: starting from Pandas 0.20.0 the .ix indexer is deprecated in favor of the stricter .iloc and .loc indexers.
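To illustrate why the choice between the two indexers matters here: indexer_between_time returns integer positions rather than index labels, so the positional indexer .iloc is the natural match. The check below is a small sketch that assumes a recent pandas where both indexer_between_time and between_time exist, so it does not apply to 0.9.0 itself.

```python
import numpy as np
import pandas as pd

dates = pd.date_range(start="2009-08-01", end="2009-08-03", freq="10min")
df = pd.DataFrame(np.random.rand(len(dates), 1) * 1500,
                  index=dates, columns=["Power"])

# indexer_between_time gives integer positions into the index, not labels.
positions = df.index.indexer_between_time("08:00", "09:50")

# Integer positions are consumed by the positional indexer .iloc.
by_position = df.iloc[positions]

# On versions that have it, between_time selects the same rows.
by_method = df.between_time("08:00", "09:50")

print(by_position.index.equals(by_method.index))
```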

Answer 2

Here is a NumPy-based way of doing it:

import pandas as pd
import numpy as np
import datetime

dates = pd.date_range(start="08/01/2009",end="08/01/2012",freq="10min")
df = pd.DataFrame(np.random.rand(len(dates), 1)*1500, index=dates, columns=['Power'])

epoch = np.datetime64('1970-01-01')
start = np.datetime64('1970-01-01 08:00:00')
end = np.datetime64('1970-01-01 09:00:00')

# convert the dates to a NumPy datetime64 array
date_array = df.index.asi8.astype('<M8[ns]') 

# replace the year/month/day with 1970-01-01
truncated = (date_array - date_array.astype('M8[D]')) + epoch

# compare the hour/minute/seconds etc with `start` and `end`
mask = (start <= truncated) & (truncated <= end)

print(df[mask])

yields

                           Power
2009-08-01 08:00:00  1007.289466
2009-08-01 08:10:00   770.732422
2009-08-01 08:20:00   617.388909
2009-08-01 08:30:00  1348.384210
...
2012-07-31 08:30:00   999.133350
2012-07-31 08:40:00  1451.500408
2012-07-31 08:50:00  1161.003167
2012-07-31 09:00:00   670.545371
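The same idea can be wrapped into a reusable function, which is closer to what the question asks for. The sketch below is one possible packaging, not part of the original answer: the name between_time_numpy and its 'HH:MM:SS' string arguments are hypothetical, and it compares elapsed time since midnight instead of shifting every timestamp to 1970-01-01. It assumes a timezone-naive DatetimeIndex and a recent pandas and NumPy.

```python
import numpy as np
import pandas as pd


def between_time_numpy(frame, start, end):
    """Rows of `frame` whose time of day lies in [start, end] (inclusive).

    Hypothetical helper: `start` and `end` are 'HH:MM:SS' strings and the
    frame is assumed to have a timezone-naive DatetimeIndex.
    """
    stamps = frame.index.values.astype("M8[ns]")
    # Subtracting the day-truncated stamps leaves the time since midnight.
    time_of_day = stamps - stamps.astype("M8[D]")
    epoch = np.datetime64("1970-01-01")
    lo = np.datetime64("1970-01-01T" + start) - epoch
    hi = np.datetime64("1970-01-01T" + end) - epoch
    return frame[(time_of_day >= lo) & (time_of_day <= hi)]


dates = pd.date_range(start="2009-08-01", end="2009-08-03", freq="10min")
df = pd.DataFrame(np.random.rand(len(dates), 1) * 1500,
                  index=dates, columns=["Power"])

result = between_time_numpy(df, "08:00:00", "09:00:00")
print(result.head())
```

Unlike the hour-only comparison, this keeps 09:00:00 but drops 09:10:00, which matches the inclusive bounds shown in the output above.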

Answer 1
2 Accepted 2016-10-19 18:06:52

Answer 2
1 2016-10-19 18:37:35
