
Remove a set of words from a list of text

I am trying to remove a list of words from a list of texts, but the words still appear in the output. Please help me remove them from the list.

text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple','orange','grapes']
words_format = r'\b(?:{})\b'.format('|'.join(words))
remove_words = lambda y: y.replace(words_format,' ')

new_text = list(map(remove_words, text_list))


Expected output:

['is good for health', 'and are tasty']

I would simply split the input, filter out the unwanted words, and join the result again:

[" ".join([word for word in text.split(" ") if word not in words]) for text in text_list]
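The split-and-filter approach above can be sketched as a small function. This is only an illustration: the name `filter_words` is hypothetical, and converting `words` to a set is an assumption added here to keep lookups cheap. The last call shows one limitation worth knowing: a word with punctuation attached (such as `apple,`) is a different token and is not removed.

```python
text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple', 'orange', 'grapes']

def filter_words(texts, unwanted):
    """Drop whitespace-separated tokens that exactly match an unwanted word."""
    unwanted_set = set(unwanted)  # set membership tests are O(1) on average
    return [
        ' '.join(token for token in text.split() if token not in unwanted_set)
        for text in texts
    ]

filtered = filter_words(text_list, words)
print(filtered)  # ['is good for health', 'and are tasty']

# Limitation: 'apple,' does not equal 'apple', so it is kept
with_punctuation = filter_words(['apple, then orange'], words)
print(with_punctuation)  # ['apple, then']
```

If the texts contain punctuation, a regex with word boundaries is likely the better fit.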

str.replace() does not recognize regular expressions; it treats its first argument as a literal string. You can use re.sub() instead:

import re

text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple', 'orange', 'grapes']
words_format = r'\b(?:{})\b'.format('|'.join(words))
remove_words = lambda y: re.sub(words_format, ' ', y)

new_text = list(map(remove_words, text_list))
print(new_text)

Output:

['  is good for health', '  and   are tasty']
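Note that the output above still contains the spaces that surrounded each removed word, so it does not quite match the expected output in the question. A minimal sketch of one way to close that gap is to collapse whitespace after the substitution. The `re.escape` call is an addition not in the original answer; it assumes the words might contain regex metacharacters and is harmless when they do not.

```python
import re

text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple', 'orange', 'grapes']

# Escape each word so characters like '.' or '+' are matched literally
pattern = re.compile(r'\b(?:{})\b'.format('|'.join(map(re.escape, words))))

def remove_words(text):
    """Remove the unwanted words, then normalize the leftover whitespace."""
    stripped = pattern.sub(' ', text)
    # split() with no arguments drops leading, trailing, and repeated whitespace
    return ' '.join(stripped.split())

new_text = [remove_words(text) for text in text_list]
print(new_text)  # ['is good for health', 'and are tasty']
```

Compiling the pattern once avoids rebuilding it for every string in the list.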


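Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.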

 