Remove list of phrases from string
I have an array of phrases:
bannedWords = ['hi', 'hi you', 'hello', 'and you']
I want to take a sentence like "hi, how are tim and you doing" and get this:
", how are tim doing"
Exact case matching is OK - sorry, should have clarified.
Since you want to remove extra spaces as well, the regex below should work better:
import re

s = "Hi, How are Tim and you doing"
bannedWords = ['hi', 'hi you', 'hello', 'and you']
for i in bannedWords:
    # remove the phrase plus any whitespace that follows it, ignoring case
    s = re.sub(i + r"\s*", '', s, flags=re.I)
print(s)
# prints: , How are Tim doing
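One thing the loop above does not guard against is a banned phrase appearing inside a longer word: 'hi' also matches the middle of 'this'. The following is a minimal sketch of one way to avoid that, assuming whole-word matching is what is wanted. The helper name `remove_whole_phrases` and the test sentence are made up for illustration and are not part of the original answer.

```python
import re

banned_words = ['hi', 'hi you', 'hello', 'and you']

def remove_whole_phrases(sentence, phrases):
    """Hypothetical helper: remove each phrase only where it stands as whole words."""
    for phrase in phrases:
        # re.escape guards against regex metacharacters inside a phrase;
        # \b on both sides stops 'hi' from matching inside 'this'.
        pattern = r'\b' + re.escape(phrase) + r'\b\s*'
        sentence = re.sub(pattern, '', sentence, flags=re.I)
    return sentence

# The plain loop, for comparison: it also eats the 'hi' inside 'this'.
naive = "hi, this is tim and you doing"
for word in banned_words:
    naive = re.sub(word + r'\s*', '', naive, flags=re.I)

print(repr(naive))  # ', ts is tim doing'
print(repr(remove_whole_phrases("hi, this is tim and you doing", banned_words)))  # ', this is tim doing'
```

Note that `\b` only behaves this way when the phrases start and end with word characters, which is true for the list in the question.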
You can use re.sub with a flag to do this in a case-insensitive manner.
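import re

bannedWords = ['hi', 'hi you', 'hello', 'and you']
sentence = "Hi, how are Tim and you doing"
# group the alternatives so the trailing \s* applies to every phrase, not only the last one
pattern = '(?:' + '|'.join(bannedWords) + r')\s*'
new_sentence = re.sub(pattern, '', sentence, flags=re.I)
# new_sentence: ", how are Tim doing"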
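With a single alternation, the order of the phrases matters: the regex engine takes the first alternative that matches, so 'hi' wins over 'hi you' when it comes first in the list. Below is a rough sketch of one way around this, sorting the phrases longest first before joining them. `build_pattern` is a hypothetical helper name and the sentence is invented for the example.

```python
import re

banned_words = ['hi', 'hi you', 'hello', 'and you']

def build_pattern(phrases):
    """Hypothetical helper: one alternation, longest phrases first."""
    ordered = sorted(phrases, key=len, reverse=True)
    alternation = '|'.join(re.escape(p) for p in ordered)
    return re.compile('(?:' + alternation + r')\s*', flags=re.I)

sentence = "hi you, how are Tim and you doing"

# Same alternation, but in the original list order ('hi' before 'hi you').
list_order = re.compile('(?:' + '|'.join(map(re.escape, banned_words)) + r')\s*', flags=re.I)

print(repr(list_order.sub('', sentence)))                   # 'you, how are Tim doing'
print(repr(build_pattern(banned_words).sub('', sentence)))  # ', how are Tim doing'
```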
With regex you can join the words you want to remove with |. We also want to replace any run of multiple whitespace characters with a single space. This ensures we only do two operations.
import re

def remove_banned(s, words):
    pattern = '|'.join(words)
    s = re.sub(pattern, '', s, flags=re.I)  # remove banned words, ignoring case
    s = re.sub(r'\s+', ' ', s)              # collapse runs of whitespace into one space
    return s

bannedWords = ['hi', 'hi you', 'hello', 'and you']
s = "Hi, How are Tim and you doing"
print(remove_banned(s, bannedWords))
Returns:
, How are Tim doing
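If the same list is applied to many sentences, the two substitutions can be compiled once and reused. This is only a sketch under that assumption: the names `BANNED`, `SPACES` and `clean` are made up here, and the final `.strip()` is an addition that also trims leading and trailing whitespace, which the function above does not do.

```python
import re

banned_words = ['hi', 'hi you', 'hello', 'and you']

# Compile both patterns once so they can be reused for many sentences.
BANNED = re.compile('|'.join(re.escape(w) for w in banned_words), flags=re.I)
SPACES = re.compile(r'\s+')

def clean(sentence):
    """Hypothetical helper: drop banned phrases, then normalise whitespace."""
    without_phrases = BANNED.sub('', sentence)
    return SPACES.sub(' ', without_phrases).strip()

for text in ["Hi, How are Tim and you doing", "hello   there"]:
    print(repr(clean(text)))
# ', How are Tim doing'
# 'there'
```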