Can you use datetime.strptime without knowing the format?
I am writing a function that takes 3 pandas Series, one of which contains dates, and I need to turn them into a DataFrame that I can resample by date. The issue is that when I simply do the following:
>>> data.index = data.time
>>> df = data.resample('M')
I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/generic.py", line 234, in resample
return sampler.resample(self)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/tseries/resample.py", line 100, in resample
raise TypeError('Only valid with DatetimeIndex or PeriodIndex')
TypeError: Only valid with DatetimeIndex or PeriodIndex
I know this is because, even though the index type is a datetime object, resampling won't read it correctly unless it is in the form datetime(x,x,x,x,x,x).
My date data looks like this: 2011-12-16 08:09:07, so I have been doing the following:
dates = data.time
date_objects = [datetime.strptime(dates[x], '%Y-%m-%d %H:%M:%S') for x in range(len(dates))]
data.index = date_objects
df = data.resample('M')
My issue is that I am using this in an open source project, and I cannot know what format the dates will be in when they are input.
So my question is: how can I turn a string containing a date and a time into a datetime object WITHOUT knowing how that string is formatted?
You can use the dateutil library for that purpose:
from dateutil import parser
yourdate = parser.parse(dates[x])
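To illustrate the dateutil suggestion above, here is a minimal sketch that parses the same moment written in a few different styles. The sample strings are made up for illustration and are not from the question. It also shows that an ambiguous string such as 01/02/2011 is read month-first unless dayfirst=True is passed, so format guessing still relies on a convention.

```python
from datetime import datetime
from dateutil import parser

# Hypothetical inputs: one moment written three different ways.
samples = [
    "2011-12-16 08:09:07",
    "Dec 16, 2011 8:09:07 AM",
    "16 December 2011 08:09:07",
]

# parser.parse guesses the layout of each string independently.
parsed = [parser.parse(s) for s in samples]

# Ambiguous numeric dates default to month-first; dayfirst=True flips that.
month_first = parser.parse("01/02/2011")
day_first = parser.parse("01/02/2011", dayfirst=True)

# Strings that cannot be interpreted as a date raise ValueError.
try:
    parser.parse("not a date")
    unparseable_raised = False
except ValueError:
    unparseable_raised = True
```

Since the guess can be wrong for ambiguous input, it may be worth letting callers pass dayfirst (or an explicit format) through your function.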
Pandas has a to_datetime function for this purpose, and when applied to a Series it will convert the values to Timestamp rather than datetime:
data.time = pd.to_datetime(data.time)
df = data.set_index('time')
Where:
In [2]: pd.to_datetime('2011-12-16 08:09:07')
Out[2]: datetime.datetime(2011, 12, 16, 8, 9, 7)
In [3]: s = pd.Series(['2011-12-16 08:09:07'])
In [4]: pd.to_datetime(s)
Out[4]:
0 2011-12-16 08:09:07
dtype: datetime64[ns]
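Putting the to_datetime approach together with the resampling step from the question, a minimal end-to-end sketch might look like the following. The column names and values are invented for illustration. It passes a MonthEnd offset object instead of the 'M' string used in the question, on the assumption that newer pandas releases prefer a different alias for month-end frequency, and it calls .mean() explicitly because recent versions return a Resampler that needs an aggregation.

```python
import pandas as pd

# Hypothetical data: a string date column plus one numeric column.
data = pd.DataFrame({
    "time": ["2011-12-16 08:09:07", "2011-12-20 10:00:00", "2012-01-03 12:30:00"],
    "value": [1.0, 3.0, 5.0],
})

# Convert the strings to datetime64 values, then use them as the index.
data["time"] = pd.to_datetime(data["time"])
df = data.set_index("time")

# The index is now a DatetimeIndex, so resampling no longer raises TypeError.
# A MonthEnd offset object is used in place of the 'M' alias.
monthly = df.resample(pd.offsets.MonthEnd()).mean()
```

Here monthly has one row per calendar month, labelled by the month-end date.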