简体   繁体   中英

Replacing multiple chars in string with another character in Python

I have a list of strings I want to check if each string contains certain characters, if it does then replace the characters with another character.

I have something like below:

invalid_chars = [' ', ',', ';', '{', '}', '(', ')', '\\n', '\\t', '=']
word = 'Ad{min > HR'
for c in list(word):
  if c in invalid_chars:
    word = word.replace(c, '_')
print (word) 

>>> Admin_>_HR

I am trying to convert this into a function using list comprehension but I am strange characters...

def replace_chars(word, checklist, char_replace = '_'):
  return ''.join([word.replace(ch, char_replace) for ch in list(word) if ch in checklist])

print(replace_chars(word, invalid_chars))
>>> Ad_min > HRAd{min_>_HRAd{min_>_HR

Try this general pattern:

''.join([ch if ch not in invalid_chars else '_' for ch in word])

For the complete function:

def replace_chars(word, checklist, char_replace = '_'):
  return ''.join([ch if ch not in checklist else char_replace for ch in word])

Note: no need to wrap string word in a list() , it's already iterable.

This might be a good use for str.translate() . You can turn your invalid_chars into a translation table with str.maketrans() and apply wherever you need it:

invalid_chars = [' ', ',', ';', '{', '}', '(', ')', '\n', '\t', '=']
invalid_table = str.maketrans({k:'_' for k in invalid_chars})

word = 'Ad{min > HR'

word.translate(invalid_table)

Result:

'Ad_min_>_HR'

This will be especially nice if you need to apply this translation to several strings and more efficient since you don't need to loop through the entire invalid_chars array for every letter, every time which you will if you us if x in invalid_chars inside a loop.

This is easier with regex. You can search for a whole group of characters with a single substitution call. It should perform better too.

>>> import re
>>> re.sub(f"[{re.escape(''.join(invalid_chars))}]", "_", word)
'Ad_min_>_HR'

The code in the f-string builds a regex pattern that looks like this

>>> pattern = f"[{re.escape(''.join(invalid_chars))}]"
>>> print(repr(pattern))
'[\\ ,;\\{\\}\\(\\)\\\n\\\t=]'
>>> print(pattern)
[\ ,;\{\}\(\)\
\   =]

That is, a regex character set containing each of your invalid chars. (The backslash escaping ensures that none of them are interpreted as a regex control character, regardless of which characters you put in invalid_chars .) If you had specified them as a string in the first place, the ''.join() would not be required.

You can also compile the pattern (using re.compile() ) if you need to re-use it on multiple words.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM