
Change part of string in a file using regex in Python

I have a file in which each line contains one timestamp. The timestamp format is 1996-07-04 00:00:00.0. I want to convert this to 1996-07-04 00:00:00, without the millisecond part, on each line. I tried using re.sub() in Python, but it replaces the match with the literal string I pass in and does not retain the original timestamp. I was using

re.sub("(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d:\d\d.\d)", "replace without millisec", cell)

The 2nd parameter is my problem.

You can use the following regex that will capture what you need to keep, and then use the backreference to restore it after a sub replacement:

\b(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\.\d+\b

Replace with \1 (in Python, written as the raw string r"\1").


IDEONE code:

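import re
# Group 1 captures the timestamp up to the seconds; \.\d+ matches the fraction to drop.
p = re.compile(r'\b(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\.\d+\b')
test_str = "1996-07-04 00:00:00.0"
print(p.sub(r"\1", test_str))  # prints 1996-07-04 00:00:00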
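
Since the question is about a file, the same substitution can be applied line by line. The following is only a minimal sketch, assuming Python 3. The helper name strip_fraction and the sample lines are hypothetical and stand in for the real file contents:

```python
import re

# Same pattern as in the answer: group 1 is the timestamp without the fraction.
TIMESTAMP = re.compile(r'\b(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\.\d+\b')

def strip_fraction(lines):
    """Yield each line with the fractional seconds removed from timestamps."""
    for line in lines:
        # \1 re-inserts the captured part, so only ".0" (the fraction) is dropped.
        yield TIMESTAMP.sub(r'\1', line)

# Hypothetical sample data standing in for the lines of the real file.
sample = [
    "10248,1996-07-04 00:00:00.0,VINET\n",
    "10249,1996-07-05 00:00:00.0,TOMSP\n",
]
cleaned = list(strip_fraction(sample))
print("".join(cleaned), end="")
```

With a real file, an open file object could be passed in place of sample, since iterating over it yields one line at a time.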

Note that you do not have to repeat the same subpattern, as in \d\d\d\d; just use a limiting quantifier {n}, where n is the number of times the subpattern must repeat. You can also set minimum and maximum bounds, like {1,4}, or only a minimum, like {2,}.
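
As a quick illustration of the three quantifier forms, here is a small sketch assuming Python 3. The variable names are made up for this example, and fullmatch is used so that the whole string must fit the pattern:

```python
import re

exactly_four = re.compile(r'\d{4}')    # same as \d\d\d\d
one_to_four = re.compile(r'\d{1,4}')   # at least 1, at most 4 digits
two_or_more = re.compile(r'\d{2,}')    # at least 2 digits, no upper bound

print(bool(exactly_four.fullmatch('1996')))    # True
print(bool(exactly_four.fullmatch('199')))     # False
print(bool(one_to_four.fullmatch('7')))        # True
print(bool(one_to_four.fullmatch('12345')))    # False
print(bool(two_or_more.fullmatch('7')))        # False
print(bool(two_or_more.fullmatch('123456')))   # True
```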
