
Regex: Adding a space in wrongly formatted datetime string

I am trying to fix wrongly formatted datetime strings by inserting a space between their 'day' and 'hours' sections (see the data below).

The rows below are the incorrectly formatted ones, which I filtered out using a regex. I can't figure out how to insert the missing space. Can this be done with regex? I'm new to regex and would appreciate it if anyone could share their experience!

df[df['time'].str.contains(r'\.[0-9][0-9][0-9][0-9]\:')]['time']
20192    2010.12.2512:01:45
20193    2010.12.2512:02:52
20196    2010.12.2512:07:43
20197    2010.12.2512:25:33
42811    2013.01.0103:04:10
43527    2013.05.1605:49:49

Your regex isn't far off; you just need to add capturing groups, between which you can place the missing space. For example:

import pandas as pd

df = pd.DataFrame({'time': ['2010.12.2512:01:45', '2010.12.2512:02:52', '2010.12.25 12:07:43']})

# Raw strings keep the backslashes intact; group 1 is ".DD", group 2 is "HH:".
df['time'].replace(r'(\.[0-9]{2})([0-9]{2}:)', r'\1 \2', regex=True)
# 0    2010.12.25 12:01:45
# 1    2010.12.25 12:02:52
# 2    2010.12.25 12:07:43
# Name: time, dtype: object

Note that I've added one "valid" string at the end of the list. The regex replace leaves it unmodified, because the pattern only matches a dot followed by four consecutive digits and a colon.
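To see what the two capturing groups do outside of pandas, here is a minimal sketch using only Python's standard `re` and `datetime` modules. The helper name `add_missing_space` and the sample list are made up for illustration, and the format string `'%Y.%m.%d %H:%M:%S'` is an assumption based on the data shown in the question.

```python
import re
from datetime import datetime

# Same pattern as above: group 1 is the dot plus the two-digit day,
# group 2 is the two-digit hour plus the colon that follows it.
PATTERN = re.compile(r'(\.[0-9]{2})([0-9]{2}:)')


def add_missing_space(value):
    """Insert a space between day and hour; well-formed strings pass through unchanged."""
    return PATTERN.sub(r'\1 \2', value)


# Hypothetical sample values taken from the question, plus one already valid string.
samples = ['2010.12.2512:01:45', '2013.01.0103:04:10', '2010.12.25 12:07:43']
fixed = [add_missing_space(s) for s in samples]

# strptime raises ValueError if any string is still malformed,
# so this doubles as a sanity check on the repair.
parsed = [datetime.strptime(s, '%Y.%m.%d %H:%M:%S') for s in fixed]

print(fixed)
```

In pandas, the same check could probably be done by passing the repaired column to `pd.to_datetime` with `format='%Y.%m.%d %H:%M:%S'`.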

