How to define format when using pandas to_datetime?
I want to plot RESULT vs TIME based on a testresult.csv file that has the following format, and I am having trouble getting the TIME column's datatype defined properly.
TIME,RESULT
03/24/2016 12:27:11 AM,2
03/24/2016 12:28:41 AM,76
03/24/2016 12:37:23 AM,19
03/24/2016 12:38:44 AM,68
03/24/2016 12:42:02 AM,44
...
To read the CSV file, this is the code I wrote:
raw_df = pd.read_csv('testresult.csv', index_col=None, parse_dates=['TIME'], infer_datetime_format=True)
This code works, but it is extremely slow, and I assume that infer_datetime_format is what takes the time. So I tried reading the CSV with the defaults first and then converting the object-dtype 'TIME' column to a datetime dtype with to_datetime(), hoping that specifying the format explicitly would speed things up.
raw_df = pd.read_csv('testresult.csv')
raw_df.loc['NEWTIME'] = pd.to_datetime(raw_df['TIME'], format='%m/%d%Y %-I%M%S %p')
This code raised an error:
"ValueError: '-' is a bad directive in format '%m/%d%Y %-I%M%S %p'"
The format you are passing is invalid. The dash between the % and the I is not supposed to be there, and the format also needs the / and : separators that appear in the data.
df['TIME'] = pd.to_datetime(df['TIME'], format="%m/%d/%Y %I:%M:%S %p")
This will convert your TIME column to a datetime.
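As a quick way to check the conversion without the original file, here is a minimal sketch that feeds a few of the sample rows from the question through an in-memory buffer. The `csv_text` variable and the use of `io.StringIO` are assumptions made for illustration; with the real data you would pass the path to testresult.csv instead.

```python
import io

import pandas as pd

# Inline copy of a few sample rows from the question (illustrative stand-in
# for testresult.csv, so no file on disk is needed).
csv_text = """TIME,RESULT
03/24/2016 12:27:11 AM,2
03/24/2016 12:28:41 AM,76
03/24/2016 12:37:23 AM,19
"""

# Read with the defaults first: TIME comes in as plain strings.
df = pd.read_csv(io.StringIO(csv_text))

# Convert with an explicit format: month/day/year, 12-hour clock, AM/PM marker.
df["TIME"] = pd.to_datetime(df["TIME"], format="%m/%d/%Y %I:%M:%S %p")

print(df.dtypes)
print(df)
```

Printing `df.dtypes` should show a datetime dtype for TIME rather than object, which is what plotting against time needs.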
Alternatively, you can adjust your read_csv call to do this:
pd.read_csv('testresult.csv', parse_dates=['TIME'],
            date_parser=lambda x: pd.to_datetime(x, format='%m/%d/%Y %I:%M:%S %p'))
Again, this uses the appropriate format without the extra -, but it also passes the format in through the date_parser parameter instead of having pandas attempt to guess it with the infer_datetime_format parameter.
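To see why the extra - is rejected, the following sketch uses only Python's standard-library `datetime.strptime`, which works with the same family of format directives that pandas documents for its `format` argument. On some platforms `%-I` is accepted when formatting output with strftime, but it is not a parsing directive. The constant names and the second sample string are made up for illustration.

```python
from datetime import datetime

GOOD_FORMAT = "%m/%d/%Y %I:%M:%S %p"
BAD_FORMAT = "%m/%d%Y %-I%M%S %p"  # the format string from the question

sample = "03/24/2016 12:27:11 AM"

# The corrected format parses the sample row; 12 AM maps to hour 0.
parsed = datetime.strptime(sample, GOOD_FORMAT)
print(parsed)

# When parsing, %I also matches an hour written without a leading zero,
# so there is no need for a "no padding" flag (hypothetical sample value).
unpadded = datetime.strptime("03/24/2016 1:05:09 PM", GOOD_FORMAT)
print(unpadded)

# The format from the question is rejected with a ValueError.
try:
    datetime.strptime(sample, BAD_FORMAT)
    bad_format_rejected = False
except ValueError as exc:
    bad_format_rejected = True
    print("rejected:", exc)
```

Testing a format string this way on a single value can be a cheap sanity check before applying it to a whole column.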
You can try this:
In [69]: df = pd.read_csv(fn, parse_dates=[0],
                          date_parser=lambda x: pd.to_datetime(x, format='%m/%d/%Y %I:%M:%S %p'))
In [70]: df
Out[70]:
                 TIME  RESULT
0 2016-03-24 00:27:11       2
1 2016-03-24 00:28:41      76
2 2016-03-24 00:37:23      19
3 2016-03-24 00:38:44      68
4 2016-03-24 00:42:02      44