I have a list of strings I want to check if each string contains certain characters, if it does then replace the characters with another character.
I have something like below:
invalid_chars = [' ', ',', ';', '{', '}', '(', ')', '\\n', '\\t', '=']
word = 'Ad{min > HR'
for c in list(word):
if c in invalid_chars:
word = word.replace(c, '_')
print (word)
>>> Admin_>_HR
I am trying to convert this into a function using list comprehension but I am strange characters...
def replace_chars(word, checklist, char_replace = '_'):
return ''.join([word.replace(ch, char_replace) for ch in list(word) if ch in checklist])
print(replace_chars(word, invalid_chars))
>>> Ad_min > HRAd{min_>_HRAd{min_>_HR
Try this general pattern:
''.join([ch if ch not in invalid_chars else '_' for ch in word])
For the complete function:
def replace_chars(word, checklist, char_replace = '_'):
return ''.join([ch if ch not in checklist else char_replace for ch in word])
Note: no need to wrap string word
in a list()
, it's already iterable.
This might be a good use for str.translate()
. You can turn your invalid_chars
into a translation table with str.maketrans()
and apply wherever you need it:
invalid_chars = [' ', ',', ';', '{', '}', '(', ')', '\n', '\t', '=']
invalid_table = str.maketrans({k:'_' for k in invalid_chars})
word = 'Ad{min > HR'
word.translate(invalid_table)
Result:
'Ad_min_>_HR'
This will be especially nice if you need to apply this translation to several strings and more efficient since you don't need to loop through the entire invalid_chars
array for every letter, every time which you will if you us if x in invalid_chars
inside a loop.
This is easier with regex. You can search for a whole group of characters with a single substitution call. It should perform better too.
>>> import re
>>> re.sub(f"[{re.escape(''.join(invalid_chars))}]", "_", word)
'Ad_min_>_HR'
The code in the f-string builds a regex pattern that looks like this
>>> pattern = f"[{re.escape(''.join(invalid_chars))}]"
>>> print(repr(pattern))
'[\\ ,;\\{\\}\\(\\)\\\n\\\t=]'
>>> print(pattern)
[\ ,;\{\}\(\)\
\ =]
That is, a regex character set containing each of your invalid chars. (The backslash escaping ensures that none of them are interpreted as a regex control character, regardless of which characters you put in invalid_chars
.) If you had specified them as a string in the first place, the ''.join()
would not be required.
You can also compile the pattern (using re.compile()
) if you need to re-use it on multiple words.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.