
pandas str-to-datetime conversion error (60 in minutes field): ParserError: minute must be in 0..59: 2015-04-24 12:60:46

I am trying to convert a Series of strings to datetime in pandas with pd.to_datetime(). However, some of the strings have 60 in the minutes field:

['2015-04-24 12:60:46',
 '2015-04-24 11:60:20',
 '2015-03-14 12:60:02',
 '2015-05-11 12:60:53',
 '2015-04-26 11:60:44',
 '2015-05-31 15:60:59',
 '2015-04-02 07:60:10',
 '2015-04-23 12:60:59',
 '2015-05-07 18:60:11',
 '2015-04-27 12:60:39',
 '2015-04-10 09:60:26',
 '2015-04-03 18:60:05',
 '2015-05-20 08:60:37',
 '2015-05-08 12:60:17',
 '2015-04-16 12:60:50',
 '2015-03-26 09:60:51',
 '2015-03-20 08:60:29',
 '2015-03-21 13:60:19',
 '2015-03-07 01:60:16',
 '2015-05-31 14:60:56',
 '2015-03-06 18:60:01',
 '2015-05-17 14:60:46',
 '2015-03-10 04:60:18',
 '2015-05-23 12:60:30',
 '2015-04-17 09:60:53',
 '2015-04-23 17:60:34',
 '2015-03-31 12:60:50',
.....]

Any idea how to solve this?
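For reference, the failure can be reproduced in isolation. This is a minimal sketch: the exact exception class and message differ between pandas versions, so it catches the base class ValueError, and the names raw, raised and coerced are illustrative only. Note that errors="coerce" does not repair the values, it only turns them into NaT.

```python
import pandas as pd

# Two of the problematic strings from the question.
raw = pd.Series(['2015-04-24 12:60:46', '2015-04-24 11:60:20'])

try:
    pd.to_datetime(raw)
    raised = False
except ValueError as exc:
    # dateutil's ParserError and pandas' own parse errors subclass ValueError.
    raised = True
    print(f"conversion failed: {exc}")

# Coercing hides the error but loses the data: every value becomes NaT.
coerced = pd.to_datetime(raw, errors="coerce")
print(coerced.isna().all())
```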

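Split the values on the space with Series.str.split, convert the date part with to_datetime, and add the time part converted to timedeltas with to_timedelta. Because to_timedelta treats 12:60:46 as a duration, the 60 minutes carry over into the next hour: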

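import pandas as pd

d = ['2015-04-24 12:60:46', '2015-04-24 11:60:20', 
     '2015-03-14 12:60:02', '2015-05-11 12:60:53']
df = pd.DataFrame({'date': d})

# Split "YYYY-MM-DD HH:MM:SS" into [date, time] on whitespace
s = df['date'].str.split()

# Date part -> datetime, time part -> timedelta (60 minutes carry into the hour)
df['date'] = pd.to_datetime(s.str[0]) + pd.to_timedelta(s.str[1])
print(df)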
                 date
0 2015-04-24 13:00:46
1 2015-04-24 12:00:20
2 2015-03-14 13:00:02
3 2015-05-11 13:00:53
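The split-and-add approach above can be wrapped in a small helper so it also handles rows whose minutes field is valid. This is only a sketch: the function name parse_overflowing_minutes is hypothetical, and it assumes every value has the form "YYYY-MM-DD HH:MM:SS".

```python
import pandas as pd

def parse_overflowing_minutes(values):
    """Parse 'YYYY-MM-DD HH:MM:SS' strings whose minutes field may be 60."""
    s = pd.Series(values, dtype="object")
    # Split once on whitespace: element 0 is the date, element 1 is the time.
    parts = s.str.split(n=1)
    dates = pd.to_datetime(parts.str[0])
    # to_timedelta reads the time as a duration, so 12:60:46 becomes 13:00:46.
    offsets = pd.to_timedelta(parts.str[1])
    return dates + offsets

result = parse_overflowing_minutes(['2015-04-24 12:60:46', '2015-04-24 11:15:20'])
print(result)
```

Rows with an ordinary time such as 11:15:20 pass through unchanged, so no special-casing is needed.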

Another idea, valid only if every value has 60 in the minutes field (otherwise use the previous solution): replace :60: with :59: and add one minute after parsing:

df['date'] = (pd.to_datetime(df['date'].replace(":60:",":59:", regex=True)) +  
              pd.Timedelta(1, 'min'))
print (df)
                 date
0 2015-04-24 13:00:46
1 2015-04-24 12:00:20
2 2015-03-14 13:00:02
3 2015-05-11 13:00:53
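If the data mixes valid minutes with 60, the same :59: trick can be applied per row by adding the extra minute only where ":60:" was found. This is a sketch under that assumption; the names raw, has_60 and fixed are illustrative.

```python
import pandas as pd

raw = pd.Series(['2015-04-24 12:60:46', '2015-04-24 11:15:20'])

# Remember which rows had 60 in the minutes field before touching them.
has_60 = raw.str.contains(":60:", regex=False)

# Make every string parseable, then add one minute only to the flagged rows.
parsed = pd.to_datetime(raw.str.replace(":60:", ":59:", regex=False))
fixed = parsed + pd.to_timedelta(has_60.astype(int), unit="min")
print(fixed)
```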

If you want to treat the 60 as 0, replace ":60:" with ":00:". If you instead want 60 to mean minute zero of the next hour, increment the hour wherever ":60:" is present, and record which values need the increment before you do the replacement.
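The two interpretations can be sketched side by side as follows. The variable names are illustrative, and the code assumes ":60:" can only occur in the minutes field, which holds for the "HH:MM:SS" layout because the seconds are not followed by a colon.

```python
import pandas as pd

raw = pd.Series(['2015-04-24 12:60:46',
                 '2015-04-24 23:60:10',
                 '2015-04-24 11:15:20'])

# Flag the affected rows first, then make the strings parseable.
has_60 = raw.str.contains(":60:", regex=False)
zeroed = pd.to_datetime(raw.str.replace(":60:", ":00:", regex=False))

# Interpretation 1: minute 60 simply means minute 0 of the same hour.
as_zero = zeroed

# Interpretation 2: minute 60 means minute 0 of the next hour.
next_hour = zeroed + pd.to_timedelta(has_60.astype(int), unit="h")

print(as_zero)
print(next_hour)
```

With the second interpretation, a value at hour 23 rolls over into the following day.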

Two-step process:

  1. Fix :60: by replacing it with :59:.
  2. Then convert to datetime.

Note that this maps 12:60:46 to 12:59:46 rather than rolling it over to 13:00:46.

import pandas as pd

d = ['2015-04-24 12:60:46', '2015-04-24 11:60:20', '2015-03-14 12:60:02',
     '2015-05-11 12:60:53', '2015-04-26 11:60:44', '2015-05-31 15:60:59',
     '2015-04-02 07:60:10', '2015-04-23 12:60:59', '2015-05-07 18:60:11',
     '2015-04-27 12:60:39', '2015-04-10 09:60:26', '2015-04-03 18:60:05',
     '2015-05-20 08:60:37', '2015-05-08 12:60:17', '2015-04-16 12:60:50',
     '2015-03-26 09:60:51', '2015-03-20 08:60:29', '2015-03-21 13:60:19',
     '2015-03-07 01:60:16', '2015-05-31 14:60:56', '2015-03-06 18:60:01',
     '2015-05-17 14:60:46', '2015-03-10 04:60:18', '2015-05-23 12:60:30',
     '2015-04-17 09:60:53', '2015-04-23 17:60:34', '2015-03-31 12:60:50']

(pd.DataFrame({"Date": d})
 # Step 1: literal (non-regex) replacement of the invalid minutes field
 .assign(Date=lambda x: x["Date"].str.replace(":60:", ":59:", regex=False))
 # Step 2: the strings are now valid and parse normally
 .assign(Date=lambda x: pd.to_datetime(x["Date"]))
)

