Iterating through a range of timestamps in Python
I have a dataframe df:
TIMESTAMP equipement1 equipement2
2016-05-10 13:20:00 0.000000 0.000000
2016-05-10 14:40:00 0.400000 0.500000
2016-05-10 15:20:00 0.500000 0.500000
I am trying to iterate through the timestamps in steps of 5 minutes. I tried:
pd.date_range(start, end, freq='5 minutes')
But I get a problem with the timestamp format:
ValueError: Could not evaluate 5 minutes
Any ideas on how to resolve this problem?
Thank you
First, make sure your TIMESTAMP column is a datetime instead of a string (e.g. df['TIMESTAMP'] = pd.to_datetime(df.TIMESTAMP)).
Next, use this column as the index of the dataframe. To make this permanent, use df.set_index('TIMESTAMP', inplace=True).
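The two preparation steps above can be sketched as follows. This is a minimal example that rebuilds the sample data from the question by hand; the way the dataframe is constructed here is an assumption for illustration, not part of the original post.

```python
import pandas as pd

# Sample data copied from the question; timestamps start out as plain strings.
df = pd.DataFrame(
    {
        "TIMESTAMP": [
            "2016-05-10 13:20:00",
            "2016-05-10 14:40:00",
            "2016-05-10 15:20:00",
        ],
        "equipement1": [0.0, 0.4, 0.5],
        "equipement2": [0.0, 0.5, 0.5],
    }
)

# Step 1: convert the string column to real datetimes.
df["TIMESTAMP"] = pd.to_datetime(df["TIMESTAMP"])

# Step 2: make the column the index, modifying df in place.
df.set_index("TIMESTAMP", inplace=True)

print(type(df.index).__name__)
```

After these two steps the index should be a DatetimeIndex, which is what resample needs.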
Now you can resample at any given frequency (e.g. 30min) and use different aggregation methods such as sum, mean (the default in older pandas versions), a lambda function, etc.
Optionally, you can add .fillna(0) to replace the NaNs with zeros.
>>> # Newer pandas versions no longer accept how=; min_count=1 keeps empty bins as NaN.
>>> df.set_index('TIMESTAMP').resample('30min').sum(min_count=1)
equipement1 equipement2
TIMESTAMP
2016-05-10 13:00:00 0.0 0.0
2016-05-10 13:30:00 NaN NaN
2016-05-10 14:00:00 NaN NaN
2016-05-10 14:30:00 0.4 0.5
2016-05-10 15:00:00 0.5 0.5
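Coming back to the original error: it most likely comes from the frequency string, since pandas expects an offset alias such as '5min' rather than '5 minutes'. The sketch below shows both a 5-minute date_range to iterate over and a 5-minute resample with .fillna(0). The sample data is rebuilt by hand, and the names start, end, steps and resampled are only illustrative.

```python
import pandas as pd

# Sample data from the question, with TIMESTAMP parsed and used as the index.
df = pd.DataFrame(
    {
        "TIMESTAMP": pd.to_datetime(
            [
                "2016-05-10 13:20:00",
                "2016-05-10 14:40:00",
                "2016-05-10 15:20:00",
            ]
        ),
        "equipement1": [0.0, 0.4, 0.5],
        "equipement2": [0.0, 0.5, 0.5],
    }
).set_index("TIMESTAMP")

# Build the 5-minute steps between the first and last timestamp.
# Note the alias '5min' instead of '5 minutes'.
start, end = df.index.min(), df.index.max()
steps = pd.date_range(start, end, freq="5min")

# Iterate over the first few steps.
for ts in steps[:3]:
    print(ts)

# Alternatively, resample to 5-minute bins; bins without data have a NaN
# mean, which fillna(0) replaces with zero.
resampled = df.resample("5min").mean().fillna(0)
print(resampled.head())
```

Whether to iterate over steps directly or to resample depends on what you want to do at each step; resampling is usually the shorter route when the goal is one aggregated row per interval.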