pandas: selecting rows in a specific time window
I have a dataset of samples covering multiple days, each with a timestamp. I want to select the rows that fall within a specific time window, e.g. all rows that were generated between 1 pm and 3 pm every day.
This is a sample of my data in a pandas dataframe:
22 22 2018-04-12T20:14:23Z 2018-04-12T21:14:23Z 0 6370.1
23 23 2018-04-12T21:14:23Z 2018-04-12T21:14:23Z 0 6368.8
24 24 2018-04-12T22:14:22Z 2018-04-13T01:14:23Z 0 6367.4
25 25 2018-04-12T23:14:22Z 2018-04-13T01:14:23Z 0 6365.8
26 26 2018-04-13T00:14:22Z 2018-04-13T01:14:23Z 0 6364.4
27 27 2018-04-13T01:14:22Z 2018-04-13T01:14:23Z 0 6362.7
28 28 2018-04-13T02:14:22Z 2018-04-13T05:14:22Z 0 6361.0
29 29 2018-04-13T03:14:22Z 2018-04-13T05:14:22Z 0 6359.3
.. ... ... ... ... ...
562 562 2018-05-05T08:13:21Z 2018-05-05T09:13:21Z 0 6300.9
563 563 2018-05-05T09:13:21Z 2018-05-05T09:13:21Z 0 6300.7
564 564 2018-05-05T10:13:14Z 2018-05-05T13:13:14Z 0 6300.2
565 565 2018-05-05T11:13:14Z 2018-05-05T13:13:14Z 0 6299.9
566 566 2018-05-05T12:13:14Z 2018-05-05T13:13:14Z 0 6299.6
How do I achieve that? I need to ignore the date and evaluate only the time component. I could traverse the dataframe in a loop and check each timestamp that way, but there must be a simpler way to do it.
I converted messageDate, which was read as a string, to a datetime with:
df["messageDate"]=pd.to_datetime(df["messageDate"])
But after that I got stuck on how to filter on the time only.
Any input is appreciated.
datetime columns expose a DatetimeProperties object through the .dt accessor, from which you can extract the time of day (as datetime.time via .dt.time, or just the hour via .dt.hour) and filter on it:
import pandas as pd
df = pd.DataFrame(
    [
        '2018-04-12T12:00:00Z', '2018-04-12T14:00:00Z', '2018-04-12T20:00:00Z',
        '2018-04-13T12:00:00Z', '2018-04-13T14:00:00Z', '2018-04-13T20:00:00Z'
    ],
    columns=['messageDate']
)
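# Convert the strings to datetimes before inspecting or filtering
df["messageDate"] = pd.to_datetime(df["messageDate"])
df
#                 messageDate
# 0 2018-04-12 12:00:00+00:00
# 1 2018-04-12 14:00:00+00:00
# 2 2018-04-12 20:00:00+00:00
# 3 2018-04-13 12:00:00+00:00
# 4 2018-04-13 14:00:00+00:00
# 5 2018-04-13 20:00:00+00:00
# (recent pandas versions parse the trailing Z as UTC, hence the +00:00)
# Keep rows from 13:00:00 up to, but not including, 15:00:00
time_mask = (df['messageDate'].dt.hour >= 13) & \
            (df['messageDate'].dt.hour < 15)
df[time_mask]
#                 messageDate
# 1 2018-04-12 14:00:00+00:00
# 4 2018-04-13 14:00:00+00:00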
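The answer mentions extracting datetime.time, while the code above filters on the hour. As a minimal sketch of the datetime.time variant, assuming the window should include both 13:00:00 and 15:00:00 exactly, the comparison could look like this (the sample timestamps and the names start, end and selected are made up for illustration):

```python
import datetime

import pandas as pd

# Illustrative timestamps around the window boundaries (not from the question)
df = pd.DataFrame(
    {
        "messageDate": pd.to_datetime(
            [
                "2018-04-12T12:59:59Z",
                "2018-04-12T13:00:00Z",
                "2018-04-12T14:30:00Z",
                "2018-04-12T15:00:00Z",
                "2018-04-13T15:00:01Z",
            ]
        )
    }
)

# Window boundaries as plain time-of-day values (assumed inclusive on both ends)
start = datetime.time(13, 0)
end = datetime.time(15, 0)

# .dt.time drops the date and leaves a Series of datetime.time objects
times = df["messageDate"].dt.time
time_mask = (times >= start) & (times <= end)

selected = df[time_mask]
print(selected)
```

Comparing datetime.time objects goes through Python-level comparisons, so on large frames the hour-based mask is likely to be faster; the benefit here is minute and second precision at the boundaries.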
I hope the code is self-explanatory. Feel free to ask questions.
import pandas as pd
# Prepping data for example
dates = pd.date_range('1/1/2018', periods=7, freq='H')
data = {'A' : range(7)}
df = pd.DataFrame(index = dates, data = data)
print(df)
#                      A
# 2018-01-01 00:00:00  0
# 2018-01-01 01:00:00  1
# 2018-01-01 02:00:00  2
# 2018-01-01 03:00:00  3
# 2018-01-01 04:00:00  4
# 2018-01-01 05:00:00  5
# 2018-01-01 06:00:00  6
# Creating a mask to select the values we wish to keep.
# Here, we use df.index because the index is our datetime.
# If the datetime is a column, you can use df['column_name'] instead.
mask = (df.index > '2018-1-1 01:00:00') & (df.index < '2018-1-1 05:00:00')
print(mask)
# [False False  True  True  True False False]
df_with_good_dates = df.loc[mask]
print(df_with_good_dates)
#                      A
# 2018-01-01 02:00:00  2
# 2018-01-01 03:00:00  3
# 2018-01-01 04:00:00  4
df = df[(df["messageDate"].apply(lambda x: x.hour) >= 13) & (df["messageDate"].apply(lambda x: x.hour) < 15)]
You can similarly use x.minute and x.second.
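For windows that do not start and end on the hour, the hour and minute components can be combined into minutes since midnight. The following is only a sketch under assumed requirements: the 13:30 to 14:45 window, the sample timestamps, and the names minutes, window_mask and selected are hypothetical, and it uses the vectorized .dt accessor rather than apply.

```python
import pandas as pd

# Illustrative timestamps around a hypothetical 13:30-14:45 window
df = pd.DataFrame(
    {
        "messageDate": pd.to_datetime(
            [
                "2018-04-12T13:29:59Z",
                "2018-04-12T13:30:00Z",
                "2018-04-12T14:44:59Z",
                "2018-04-12T14:45:00Z",
            ]
        )
    }
)

# Minutes elapsed since midnight for every row (seconds are ignored)
minutes = df["messageDate"].dt.hour * 60 + df["messageDate"].dt.minute

# Keep 13:30 (inclusive) up to 14:45 (exclusive)
window_mask = (minutes >= 13 * 60 + 30) & (minutes < 14 * 60 + 45)

selected = df[window_mask]
print(selected)
```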
Try this after ensuring that messageDate is indeed in datetime format, as you have done:
df.set_index('messageDate',inplace=True)
choseInd = [ind for ind in df.index if 13 <= ind.hour < 15]
df_select = df.loc[choseInd]
You can do the same without making the datetime column the index, as the answer using apply with a lambda shows.
Having the datetime as your index rather than a numerical one just makes your dataframe 'better looking'.
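Once the datetime column is the index, pandas also provides DataFrame.between_time, which selects rows by time of day regardless of the date. A minimal sketch, assuming both window ends should be included (the default behaviour); the sample rows and the value column name are made up for illustration:

```python
import pandas as pd

# Illustrative data; "value" is a hypothetical column name
df = pd.DataFrame(
    {
        "messageDate": pd.to_datetime(
            [
                "2018-04-12T12:14:23Z",
                "2018-04-12T13:14:23Z",
                "2018-04-12T14:14:22Z",
                "2018-04-13T13:00:00Z",
                "2018-04-13T20:14:22Z",
            ]
        ),
        "value": [6370.1, 6368.8, 6367.4, 6365.8, 6364.4],
    }
)

# between_time requires a DatetimeIndex, so index by the timestamp first
df_indexed = df.set_index("messageDate")

# Select every row whose time of day lies between 13:00 and 15:00, on any date
df_select = df_indexed.between_time("13:00", "15:00")
print(df_select)
```

This avoids the Python-level loop over the index, and a plain boolean selection like this is not affected by duplicate timestamps in the index.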