Python: How to develop a between_time-like method on pandas 0.9.0?
I am stuck on pandas 0.9.0 because I am working under Python 2.5, so the between_time method is not available.
I have a DataFrame indexed by dates and would like to keep only the rows that fall between certain hours, e.g. between 08:00 and 09:00, on every date in the DataFrame df.
import pandas as pd
import numpy as np
import datetime
dates = pd.date_range(start="08/01/2009",end="08/01/2012",freq="10min")
df = pd.DataFrame(np.random.rand(len(dates), 1)*1500, index=dates, columns=['Power'])
How can I write a method that provides the same functionality as the between_time method?
NB: The original problem I am trying to solve is described under "Python: Filter DataFrame in Pandas by hour, day and month grouped by year".
UPDATE:
Try using:
df.iloc[df.index.indexer_between_time('08:00','09:50')]
OLD answer:
I'm not sure whether it will work on Pandas 0.9.0, but it's worth trying:
df[(df.index.hour >= 8) & (df.index.hour <= 9)]
PS: please be aware that this is not the same as between_time. It checks only the hour, so the filter above keeps everything from 08:00 up to 09:59, whereas between_time can check times to the second, e.g. df.between_time('08:01:15','09:13:28').
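If the index exposes .minute and .second alongside .hour, the hour-only check can be tightened by comparing seconds since midnight. The following is a minimal sketch, not tested on Pandas 0.9.0; the helper names seconds_since_midnight and time_mask are made up for illustration. It relies only on the .hour, .minute and .second attributes, so the same function works on a single datetime and, presumably, on a DatetimeIndex:

```python
import datetime

def seconds_since_midnight(ts):
    # Works for anything exposing .hour/.minute/.second: a datetime,
    # a datetime.time, or (assumption) a pandas DatetimeIndex.
    return ts.hour * 3600 + ts.minute * 60 + ts.second

def time_mask(ts, start, end):
    # Hypothetical helper: True where start <= time of day <= end.
    # Sub-second precision is ignored.
    secs = seconds_since_midnight(ts)
    lo = seconds_since_midnight(start)
    hi = seconds_since_midnight(end)
    return (lo <= secs) & (secs <= hi)

start = datetime.time(8, 1, 15)
end = datetime.time(9, 13, 28)
print(time_mask(datetime.datetime(2009, 8, 1, 8, 1, 14), start, end))  # False
print(time_mask(datetime.datetime(2009, 8, 1, 8, 1, 15), start, end))  # True
```

With the DataFrame from the question this would be used as df[time_mask(df.index, start, end)]. Unlike between_time, this sketch does not handle a window that wraps past midnight.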
Hint: download the source code of a newer version of Pandas and take a look at the definition of the indexer_between_time() function in pandas/tseries/index.py. You can adapt it to your needs.
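As a rough idea of what such a clone might look like, here is a pure-Python sketch. It is not the Pandas implementation; the name indexer_between_time_sketch is hypothetical, and the wrap-around rule for start > end is an assumption modelled on how newer Pandas versions appear to behave. It uses only the standard library and returns integer positions, like indexer_between_time:

```python
import datetime

def indexer_between_time_sketch(timestamps, start, end):
    """Return positions whose time of day lies in [start, end] (inclusive)."""
    positions = []
    for pos, ts in enumerate(timestamps):
        t = ts.time()
        if start <= end:
            inside = start <= t <= end
        else:
            # Assumed behaviour: the window spans midnight, e.g. 22:00 to 02:00
            inside = t >= start or t <= end
        if inside:
            positions.append(pos)
    return positions

# Ten timestamps, 10 minutes apart: 07:50, 08:00, ..., 09:20
first = datetime.datetime(2009, 8, 1, 7, 50)
stamps = [first + datetime.timedelta(minutes=10 * i) for i in range(10)]

hits = indexer_between_time_sketch(stamps, datetime.time(8, 0), datetime.time(9, 0))
print(hits)  # [1, 2, 3, 4, 5, 6, 7]
```

Assuming that iterating over df.index yields objects with a .time() method, the positions could then be passed to a positional indexer, e.g. df.ix[...] on old versions or df.iloc[...] on newer ones. The Python loop will be much slower than a vectorised approach on a large index.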
UPDATE: starting with Pandas 0.20.1 the .ix indexer is deprecated in favor of the stricter .iloc and .loc indexers.
Here is a NumPy-based way of doing it:
import pandas as pd
import numpy as np
import datetime
dates = pd.date_range(start="08/01/2009",end="08/01/2012",freq="10min")
df = pd.DataFrame(np.random.rand(len(dates), 1)*1500, index=dates, columns=['Power'])
epoch = np.datetime64('1970-01-01')
start = np.datetime64('1970-01-01 08:00:00')
end = np.datetime64('1970-01-01 09:00:00')
# convert the dates to a NumPy datetime64 array
date_array = df.index.asi8.astype('<M8[ns]')
# replace the year/month/day with 1970-01-01
truncated = (date_array - date_array.astype('M8[D]')) + epoch
# compare the hour/minute/seconds etc with `start` and `end`
mask = (start <= truncated) & (truncated <=end)
print(df[mask])
yields
Power
2009-08-01 08:00:00 1007.289466
2009-08-01 08:10:00 770.732422
2009-08-01 08:20:00 617.388909
2009-08-01 08:30:00 1348.384210
...
2012-07-31 08:30:00 999.133350
2012-07-31 08:40:00 1451.500408
2012-07-31 08:50:00 1161.003167
2012-07-31 09:00:00 670.545371