datetime conversion ValueError Pandas

I have a dataset with out-of-range times (24:00:00 to 26:18:00). What is the best approach to deal with this kind of data in Python?

I tried to convert the column from object to datetime using this code:

stopTimeArrDep['departure_time'] = pd.to_datetime(
    stopTimeArrDep['departure_time'], format='%H:%M:%S')

But I get this error:

ValueError: time data '24:04:00' does not match format '%H:%M:%S' (match)

So I tried adding errors='coerce' to avoid this error, but I end up with empty cells for the out-of-range times and an unwanted date added to every other row.

stopTimeArrDep['departure_time'] = pd.to_datetime(
    stopTimeArrDep['departure_time'], format='%H:%M:%S', errors='coerce')

Output sample:

original_col    converted_col
23:45:00        1/1/00 23:45:00
23:51:00        1/1/00 23:51:00
24:04:00
23:42:00        1/1/00 23:42:00
26:01:00

Any suggestions on the best approach to handle this issue? Thank you.

Solution

You could treat original_col as an elapsed time interval rather than a time of day, if that makes sense for your data. Use datetime.timedelta to represent the interval, then add it to a datetime.datetime to get a datetime object, from which you can extract the date and the time separately.
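The reason the interval view helps is that the %H directive only accepts hours 00 through 23, while timedelta places no such limit on hours. A minimal sketch of that difference, using one of the values from the question:

```python
from datetime import datetime, timedelta

# %H only matches hours 00-23, so parsing "24:04:00" as a time of day fails
try:
    datetime.strptime("24:04:00", "%H:%M:%S")
    parsed = True
except ValueError:
    parsed = False

# timedelta accepts any number of hours and normalizes them into days
delta = timedelta(hours=24, minutes=4, seconds=0)

print(parsed)  # False
print(delta)   # 1 day, 0:04:00
```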

Example

from datetime import datetime, timedelta

time_string = "20:30:20"

t = datetime.utcnow()
print('t: {}'.format(t))
HH, MM, SS = [int(x) for x in time_string.split(':')]
dt = timedelta(hours=HH, minutes=MM, seconds=SS)
print('dt: {}'.format(dt))
t2 = t + dt
print('t2: {}'.format(t2))
print('t2.date: {} | t2.time: {}'.format(str(t2.date()), str(t2.time()).split('.')[0]))

Output:

t: 2019-10-24 04:43:08.255027
dt: 20:30:20
t2: 2019-10-25 01:13:28.255027
t2.date: 2019-10-25 | t2.time: 01:13:28
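Because the example adds the interval to the current UTC time, its result changes on every run. If the times are offsets from midnight of a known day (an assumption about the data, not something the question states), a fixed base date gives reproducible results. In this sketch the function name resolve_time and the base date are hypothetical:

```python
from datetime import datetime, timedelta

def resolve_time(time_string, base_date):
    """Add an 'HH:MM:SS' offset (hours may exceed 23) to midnight of base_date."""
    hh, mm, ss = (int(x) for x in time_string.split(":"))
    return base_date + timedelta(hours=hh, minutes=mm, seconds=ss)

base = datetime(2019, 10, 24)  # hypothetical base day, midnight
t2 = resolve_time("26:01:00", base)
print(t2.date(), t2.time())  # 2019-10-25 02:01:00
```

A value such as 26:01:00 then resolves to 02:01:00 on the day after the base date.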

For your use case

import pandas as pd
from datetime import datetime, timedelta

# Convert an 'HH:MM:SS' string (hours may exceed 23) into a timedelta
def process_row(time_string):
    HH, MM, SS = [int(x) for x in time_string.split(':')]
    return timedelta(hours=HH, minutes=MM, seconds=SS)

# Make dummy data
original_col = ["23:45:00", "23:51:00", "24:04:00", "23:42:00", "26:01:00"]
df = pd.DataFrame({'original_col': original_col})

# Process the DataFrame
df['dt'] = df['original_col'].apply(process_row)
df['t'] = datetime.utcnow()
df['t2'] = df['dt'] + df['t']
# Extract the date from the timestamp
df['Date'] = df['t2'].dt.date
# Extract the time from the timestamp
df['Time'] = df['t2'].dt.time
df


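Using pandas.to_datetime():

df['t2'] already has dtype datetime64[ns] at this point, so the call below returns it unchanged; the format argument is not applied here.

pd.to_datetime(df['t2'], format='%H:%M:%S', errors='coerce')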

Output:

0   2019-10-25 09:38:39.349410
1   2019-10-25 09:44:39.349410
2   2019-10-25 09:57:39.349410
3   2019-10-25 09:35:39.349410
4   2019-10-25 11:54:39.349410
Name: t2, dtype: datetime64[ns]
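As a vectorized alternative to the row-by-row function, pandas.to_timedelta parses 'HH:MM:SS' strings as durations, so hours above 23 are accepted. This is a sketch of a different technique from the one shown above, and the base date is a hypothetical stand-in for whatever day the times belong to:

```python
import pandas as pd

df = pd.DataFrame({"original_col": ["23:45:00", "24:04:00", "26:01:00"]})

# to_timedelta reads each string as a duration, so "26:01:00" is valid
df["dt"] = pd.to_timedelta(df["original_col"])

# Hypothetical base date; adding the durations rolls over into the next day
base = pd.Timestamp("2019-10-24")
df["t2"] = base + df["dt"]

print(df)
```

This keeps the whole column in timedelta64 and datetime64 dtypes, so the .dt accessor is available for extracting dates and times afterwards.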

References

  1. How to construct a timedelta object from a simple string
