I want to look ahead for a missing ')' and add it with re.sub, but I get strange results when using a negative lookahead:
a='D, M, departementsråd (fr.o.m. 2018-11-22 t.o.m. 2021-09-30 E, A, chef för Statens haverikommission (fr.o.m. 1997-07-01 t.o.m. 1997-09-07)'
re.sub(r'(t\.o\.m\.\s*\d{4}-\d{1,2}-\d{1,2})(?!\))',r'\1\)',a)
result:
D, M, departementsråd (fr.o.m. 2018-11-22 t.o.m. 2021-09-30\\) E, A, chef för Statens haverikommission (fr.o.m. 1997-07-01 t.o.m. 1997-09-0\\)7)
what I want:
D, M, departementsråd (fr.o.m. 2018-11-22 t.o.m. 2021-09-30) E, A, chef för Statens haverikommission (fr.o.m. 1997-07-01 t.o.m. 1997-09-07)
I want to add the missing ) after t.o.m. 2021-09-30, but it doesn't work.
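You get that result because \d{1,2} leaves paths to explore through backtracking: the {1,2} quantifier lets it match either one or two digits. The part \d{1,2}(?!\)) matches 1 or 2 digits and asserts that what is directly to the right is not ). In 07) the two-digit match 07 fails the assertion because ) follows, so the engine backtracks to the single digit 0, which is followed by 7 rather than ), and the match succeeds there.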
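A minimal sketch of that backtracking, using the pattern from the question on just the tail of the string (the variable names here are only for illustration):

```python
import re

# Pattern from the question, without the capture group.
pattern = re.compile(r"t\.o\.m\.\s*\d{4}-\d{1,2}-\d{1,2}(?!\))")

# The date is already followed by ")", yet the pattern still matches:
# \d{1,2} gives back the "7" so that the lookahead sees "7" instead of ")".
m = pattern.search("t.o.m. 1997-09-07)")
matched = m.group(0)
print(matched)  # t.o.m. 1997-09-0
```

The match stops one digit short, which is why the replacement ends up in the middle of the day number.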
What you might do is add a word boundary after the last digits, \d{1,2}\b, so the match cannot end in the middle of a number:
t\.o\.m\.\s*\d{4}-\d{1,2}-\d{1,2}\b(?!\))
In the replacement you could use the full match instead of group 1, and drop the backslash before ), since it is not special there and would be inserted literally:
\g<0>)
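Put together, the suggestion could look like the sketch below. It reuses the string from the question; the names pattern and fixed are only illustrative.

```python
import re

a = ('D, M, departementsråd (fr.o.m. 2018-11-22 t.o.m. 2021-09-30 '
     'E, A, chef för Statens haverikommission '
     '(fr.o.m. 1997-07-01 t.o.m. 1997-09-07)')

# \b keeps \d{1,2} from backtracking into the middle of the day number;
# the negative lookahead then skips dates that already have a ")".
pattern = r"t\.o\.m\.\s*\d{4}-\d{1,2}-\d{1,2}\b(?!\))"

# \g<0> is the whole match; the ")" after it is a plain character.
fixed = re.sub(pattern, r"\g<0>)", a)
print(fixed)
```

Only the first date gets a closing parenthesis; the second one already has it, so it is left unchanged.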