Replacing non-alphanumeric characters in regex match using Python

Question

I have a text file (Verilog) that contains certain string sequences (escaped identifiers) that I want to modify. In the example below, I want to find any group starting with a backslash ('\') and ending with a space (' '); any printable character can be in between. After finding a group that matches this criterion, I want to replace all non-alphanumeric characters with alphanumeric ones (I don't really care which alphanumeric characters they get replaced with).

In[1]:  here i$ \$0me text to \m*dify
Out[1]: here i$ aame text to madify

I have no problem finding the groups that need replacing using regex. However, if I just use re.findall(), I no longer have the location of the words in the string, so I cannot reconstruct the string after modifying the matched groups.

Is there a way to preserve the location of the words in the string while modifying each match separately?

Note: I previously asked a very similar question here, but I oversimplified my example. I thought editing my existing question would make the existing comments and answers confusing to future readers.

Answer 1

My answer to your previous question still applies, with some minor modifications. Only the regex changes.

Since this is more complex, define a function to pass as a callback.

In [57]: def foo(m):
    ...:     return ''.join(x if re.match('[a-zA-Z]', x)
    ...:                    else ('' if x == '\\' else 'a') for x in m.group())

Now, call re.sub:

In [58]: re.sub(r'\\.*?(?= |$)', foo, text)
Out[58]: 'here i$ aame text to madify'
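
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same callback approach. The names ESCAPED_ID, sanitize and keep_digits are my own and purely illustrative. Note that the callback above keeps only letters, so the digit in \$0me is also replaced (which matches the expected output in the question); the hypothetical keep_digits flag shows one way to keep digits instead.

```python
import re

# A backslash followed by as few characters as possible, up to (but not
# including) the next space or the end of the string.
ESCAPED_ID = re.compile(r'\\.*?(?= |$)')

def sanitize(match, keep_digits=False):
    """Rewrite one matched escaped identifier (hypothetical helper).

    The leading backslash is dropped, allowed characters are kept, and every
    other character becomes 'a'. keep_digits is an illustrative option that
    is not part of the original answer.
    """
    allowed = '[a-zA-Z0-9]' if keep_digits else '[a-zA-Z]'
    out = []
    for ch in match.group():
        if re.match(allowed, ch):
            out.append(ch)       # keep allowed characters unchanged
        elif ch != '\\':
            out.append('a')      # replace everything else except the backslash
    return ''.join(out)

text = r'here i$ \$0me text to \m*dify'

# re.sub calls sanitize once per match and splices the result back in place,
# so the rest of the string is left untouched.
print(ESCAPED_ID.sub(sanitize, text))
print(ESCAPED_ID.sub(lambda m: sanitize(m, keep_digits=True), text))
```

Because re.sub does the splicing itself, there is no need to track match positions by hand. One caveat: without the re.MULTILINE flag, $ only matches at the very end of the string, so for a whole file read into one string, an identifier that sits directly before a line break would probably need that flag (or a pattern such as \s in the lookahead) to be picked up.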

solution1
1 ACCEPTED 2017-08-04 15:29:59
