How can I format this into a date in pandas?

26JAN2015:14:42:03

How do I parse that value as a date in pandas? I have two columns in a raw file in that format, and I need them as dates so I can subtract one from the other to measure the time between them.

Also, a quick sanity check. When I deal with dates (normally from Excel or .csv files), I use code like this:

df['Start'] = pd.to_datetime(df['Start'], errors='coerce')

df['Date'] = df['Start'].apply(lambda x:x.date().strftime('%Y-%m-%d'))

df['TimeDelta'] = ((df['Start'] - df['End']).astype('timedelta64[s]'))/86400

First I call pd.to_datetime to convert the object data to datetimes, and then I typically use a lambda to switch the format to the ISO standard. I also subtract two date columns to get the time between them and divide by 86400 seconds to turn it into days. Are these the most efficient commands for this?

Call to_datetime and pass the format string:

In [114]:

df = pd.DataFrame({'date':['26Jan2015:14:42:03']})
df['date'] = pd.to_datetime(df['date'], format='%d%b%Y:%H:%M:%S')
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 0 to 0
Data columns (total 1 columns):
date    1 non-null datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 16.0 bytes
In [115]:

df
Out[115]:
                 date
0 2015-01-26 14:42:03
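
For the sanity-check part of the question, the same format string can be applied to both columns, and the `.dt` accessor can stand in for the lambda and the `astype` call. The following is a minimal sketch rather than a definitive answer: the sample rows are made up, the column names `Start`, `End`, `Date` and `TimeDelta` are taken from the question, and it assumes a pandas version that accepts `errors='coerce'`. It subtracts `Start` from `End` so that the result is positive, which is the reverse of the order used in the question.

```python
import pandas as pd

# Hypothetical sample data in the question's format (not from the original post).
df = pd.DataFrame({
    "Start": ["26JAN2015:14:42:03", "27JAN2015:08:00:00"],
    "End": ["27JAN2015:14:42:03", "27JAN2015:20:00:00"],
})

# Parse both columns with an explicit format; unparseable values become NaT.
fmt = "%d%b%Y:%H:%M:%S"
for col in ("Start", "End"):
    df[col] = pd.to_datetime(df[col], format=fmt, errors="coerce")

# Vectorized ISO date string, no per-row lambda needed.
df["Date"] = df["Start"].dt.strftime("%Y-%m-%d")

# Elapsed time in days as a float (End - Start, so the values are positive).
df["TimeDelta"] = (df["End"] - df["Start"]).dt.total_seconds() / 86400

print(df)
```

Passing an explicit `format` also tends to be faster than letting pandas infer the layout for every value, although the difference depends on the pandas version and the size of the data.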

Another variant is to use a regular expression:

import re
dat = "26JAN2015:14:42:03"
# Capture day, month abbreviation, year, hour, minute and second
match = re.match(r"(\d+)(\D+)(\d+):(\d+):(\d+):(\d+)", dat)

print(match.groups())

>>> ('26', 'JAN', '2015', '14', '42', '03')
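
The regex only yields strings, so one more step is needed before the values can be subtracted. The sketch below shows one way to turn the captured groups into a `datetime` using only the standard library. The `MONTHS` table and the `parse_stamp` helper are illustrative names that are not part of the original answer, and the sketch assumes English three-letter month abbreviations.

```python
import re
from datetime import datetime

# Hypothetical lookup table: English month abbreviations to month numbers.
MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

PATTERN = re.compile(r"(\d{2})([A-Za-z]{3})(\d{4}):(\d{2}):(\d{2}):(\d{2})")

def parse_stamp(text):
    """Convert a string like '26JAN2015:14:42:03' into a datetime."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {text!r}")
    day, month, year, hour, minute, second = match.groups()
    return datetime(int(year), MONTHS[month.upper()], int(day),
                    int(hour), int(minute), int(second))

stamp = parse_stamp("26JAN2015:14:42:03")
print(stamp)  # 2015-01-26 14:42:03
```

For this particular layout, `datetime.strptime(text, "%d%b%Y:%H:%M:%S")` should give the same result without a hand-written pattern; the regex route is mainly useful when the input needs validation or cleanup first.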
