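Remove a set of words from a list of text
I am trying to remove a list of words from a list of texts, but the words do not seem to be removed from the output. Please help me remove them from the list.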
text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple','orange','grapes']
words_format = r'\b(?:{})\b'.format('|'.join(words))
remove_words = lambda y: y.replace(words_format,' ')
new_text = list(map(remove_words, text_list))
Expected output:
['is good for health', 'and are tasty']
I would simply split the input, filter out the unwanted words, and then join the result again:
[" ".join([word for word in text.split(" ") if word not in words]) for text in text_list]
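The same split/filter/join idea can be wrapped in a small helper. This is only a sketch: the name drop_words is hypothetical (not from the original answer), it converts the word list to a set for faster lookups, and it calls split() with no argument so that runs of whitespace collapse. It matches whole whitespace-delimited tokens only, so a token such as 'apple,' (with trailing punctuation) would be kept.

```python
def drop_words(texts, words):
    """Remove whole whitespace-delimited tokens listed in `words` from each text."""
    unwanted = set(words)  # set membership checks are O(1) on average
    return [
        " ".join(token for token in text.split() if token not in unwanted)
        for text in texts
    ]


text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple', 'orange', 'grapes']

result = drop_words(text_list, words)
print(result)  # ['is good for health', 'and are tasty']
```

If the texts contain punctuation attached to the words, a regex with word boundaries is likely the better fit.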
str.replace() does not recognize regular expressions; it treats its first argument as a literal string. You can use re.sub() instead.
import re
text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple', 'orange', 'grapes']
words_format = r'\b(?:{})\b'.format('|'.join(words))
remove_words = lambda y: re.sub(words_format, ' ', y)
new_text = list(map(remove_words, text_list))
print(new_text)
Output:
['  is good for health', '  and   are tasty']
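Replacing each match with a space leaves leading or repeated whitespace in the result, so it does not quite match the expected output from the question. One possible way to tidy this up, sketched below, is to collapse whitespace after the substitution. The function name remove_words_clean is hypothetical, and the re.escape() call is an added precaution (not part of the original answer) for words that contain regex metacharacters.

```python
import re


def remove_words_clean(texts, words):
    """Remove whole-word matches of `words`, then normalize whitespace."""
    if not words:
        # An empty alternation would match at every word boundary, so skip the regex.
        return [" ".join(text.split()) for text in texts]
    # Escape each word so characters like '+' or '.' are matched literally.
    pattern = re.compile(r'\b(?:{})\b'.format('|'.join(map(re.escape, words))))
    cleaned = []
    for text in texts:
        stripped = pattern.sub(' ', text)
        # split() with no argument drops leading/trailing and repeated whitespace.
        cleaned.append(' '.join(stripped.split()))
    return cleaned


text_list = ['apple is good for health', 'orange and grapes are tasty']
words = ['apple', 'orange', 'grapes']

new_text = remove_words_clean(text_list, words)
print(new_text)  # ['is good for health', 'and are tasty']
```

Unlike the split-based approach, the word-boundary pattern also removes a word that is followed by punctuation, though the punctuation itself stays in the text.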
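Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.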