Replace anything not matching regex with symbol

I want to replace every character that does NOT match a regex with a symbol such as '#'. So far I have something like:

def repl(m):
    return '#' * len(m.group())

re.sub(some_regex, repl, some_string)

This lets me replace any text that matches the regex with a run of '#' of the same length, but is there an easy way to modify it so that everything that doesn't match some_regex becomes '#' instead?

Use re.split() to split the input, using the regexp as the delimiter. Then replace each piece that lies between the matches with '#' repeated to the same length, and join everything back into one string.

The regexp needs a capturing group around the whole pattern so that the delimiters (the matches themselves) are included in the result. If the pattern needs other groups, make them non-capturing, since every capturing group adds extra items to the list that re.split() returns.

import re

def repl(s):
    return '#' * len(s)

some_string = 'This is text blah, other text blahhhh and more text'
some_regex = r'(blah+)'

split = re.split(some_regex, some_string)
for i in range(0, len(split), 2): # every other element is between the matches
    split[i] = repl(split[i])

new_string = ''.join(split)
print(new_string)
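
To see why the capturing group matters, it can help to compare the lists that re.split() returns with and without it. The following is only an illustrative sketch; the sample string and the variable names are made up for the demonstration and are not part of the answer above.

```python
import re

text = 'a blah b blahh c'  # hypothetical sample input

# Without a capturing group the matches (the delimiters) are dropped.
without_group = re.split(r'blah+', text)

# With one capturing group around the whole pattern, the matches are kept
# and land at the odd indices; the text between them is at the even indices.
with_group = re.split(r'(blah+)', text)

# An inner group written as (?:...) is non-capturing, so the list keeps
# the same alternating layout.
inner_non_capturing = re.split(r'((?:bl)ah+)', text)

print(without_group)
print(with_group)
print(inner_non_capturing)
```

Because the layout alternates strictly between non-matching and matching pieces, stepping through the even indices, as the loop above does, touches only the text that did not match.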

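You could start with a string made entirely of '#', then put each match back at its original position:

import re

a = "abcdef"
b = list('#' * len(a))  # one '#' per character of the input

for match in re.finditer(r'c', a):
    # copy the matched text back over the same span (match[0] needs Python 3.6+)
    b[match.start():match.end()] = match[0]

print(''.join(b))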
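
The same idea can be wrapped in a small helper so that it works with any pattern without needing a capturing group around it. This is a minimal sketch; the function name mask_non_matches and its fill parameter are hypothetical and chosen only for illustration.

```python
import re

def mask_non_matches(pattern, text, fill='#'):
    """Return text with every character outside a match replaced by fill.

    Hypothetical helper; assumes fill is a single character so that the
    result has the same length as the input.
    """
    masked = [fill] * len(text)
    for m in re.finditer(pattern, text):
        # Restore the matched characters at their original positions.
        masked[m.start():m.end()] = m.group()
    return ''.join(masked)

sample = 'This is text blah, other text blahhhh and more text'
result = mask_non_matches(r'blah+', sample)
print(result)
```

Since this version works from match positions rather than from the list returned by re.split(), the pattern may contain capturing groups without affecting the result.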
