
Python regex matching between two wildcard characters

I have a file that contains some lines with the following format.

...
...
ABC_DEF( ac, bad, dd, ..)
...
...

I want to pull ac and bad out of the ABC_DEF call and modify the file so that it reads:

...
...
ac, bad, 
ABC_DEF(dd, ...)
...
...

ac and bad are just examples; the real values are alphanumeric strings of varying length.

I have the following code in Python:

import fileinput
import re

for line in fileinput.input(inplace=1):
    line = re.sub(r'ABC_DEF\(\w+,\w+,', r'ABC_DEF(', line.rstrip())
    print(line)

But this does not seem to work. Can someone please help?

Thanks,

I think you need

line = re.sub(r'ABC_DEF\(\s*\w+\s*,\s*\w+\s*,\s*', r'ABC_DEF(', line.rstrip())

because there could be spaces around the words.

>>> line = 'ABC_DEF(  first ,  second   , third, fourth)'
>>> line = re.sub(r'ABC_DEF\(\s*\w+\s*,\s*\w+\s*,\s*', r'ABC_DEF(', line.rstrip())
>>> line
'ABC_DEF(third, fourth)'
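
To make the difference concrete, here is a small side-by-side sketch that runs both patterns against the sample line from the question. The names STRICT and LENIENT are just labels I made up for this illustration.

```python
import re

# Pattern from the question: no whitespace allowed around the words.
STRICT = r'ABC_DEF\(\w+,\w+,'
# Pattern from this answer: optional whitespace around each word and comma.
LENIENT = r'ABC_DEF\(\s*\w+\s*,\s*\w+\s*,\s*'

line = 'ABC_DEF( ac, bad, dd, ..)'

# The strict pattern fails at the space after "(", so nothing is replaced.
strict_result = re.sub(STRICT, 'ABC_DEF(', line)
# The lenient pattern consumes the first two arguments and their spacing.
lenient_result = re.sub(LENIENT, 'ABC_DEF(', line)

print(strict_result)
print(lenient_result)
```

The first print shows the line unchanged, which is why the original code appeared to do nothing.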

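UPDATE: You asked in the comments how to capture the values. You do this by putting parentheses around the parts you want to capture and calling re.match instead. Note that re.match only matches at the start of the string, so use re.search if ABC_DEF can appear later in the line. Like this: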

>>> line = 'ABC_DEF(  first ,  second   , third, fourth)'
>>> match = re.match(r'ABC_DEF\(\s*(\w+)\s*,\s*(\w+)\s*,\s*', line)
>>> match.group(1)
'first'
>>> match.group(2)
'second'
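
Putting the captured groups to use, here is a rough sketch of how the output asked for in the question could be produced in one pass. The helper name hoist_first_two_args is hypothetical, and the exact spacing of the emitted line is my assumption; adjust it to whatever the file needs.

```python
import re

# Same pattern as above, with the first two arguments captured.
PATTERN = re.compile(r'ABC_DEF\(\s*(\w+)\s*,\s*(\w+)\s*,\s*')

def _move_args(match):
    # Emit the two captured words on their own line, then reopen the call.
    return f"{match.group(1)}, {match.group(2)},\nABC_DEF("

def hoist_first_two_args(line):
    """Hypothetical helper: move the first two ABC_DEF arguments onto a
    line of their own. Lines that do not match come back unchanged."""
    return PATTERN.sub(_move_args, line)

print(hoist_first_two_args('ABC_DEF( ac, bad, dd, ..)'))
```

Inside the fileinput loop from the question, print(hoist_first_two_args(line.rstrip())) should write the result back to the file. This sketch does not preserve any indentation in front of ABC_DEF on the second line, so that would need handling separately if it matters.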
