Removing words in a list from column in Pandas dataframe

I have a dataframe of various wines. I am trying to remove all punctuation, all words containing 4 or fewer characters, as well as the words flavors, aromas, finish, and drink from the string values in the 'description' column. My code does not appear to be working, and I have tried various permutations of it to no avail.

remove_list = ['[^\w\s]', '[\b(\w{1,4})\b]', 'flavors', 'aromas', 'finish', 'drink']

df11['description'].str.replace('|'.join(remove_list), '', regex=True)

Try:

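remove_list = [r'[^\w\s]', r'\b\w{1,4}\b', 'flavors', 'aromas', 'finish', 'drink']

In the original pattern, the square brackets around \b(\w{1,4})\b turn it into a character class, so it matches single characters rather than whole short words. The r prefix (raw string) also keeps Python from reading \b as a backspace character. The upper bound is 4 here because the question asks for words of 4 or fewer characters.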
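Here is a minimal end-to-end sketch of how that list might be applied. The sample rows are hypothetical stand-ins for df11. Three details are assumptions added for illustration and are not part of the answer: the \b boundaries around the listed words (so that, for example, "finished" is left alone), the extra pass that collapses leftover whitespace, and assigning the result back to the column. The assignment is there because str.replace returns a new Series, so the column looks unchanged if the result is discarded.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df11.
df11 = pd.DataFrame({
    "description": [
        "Ripe plum aromas, with a smooth, lingering finish.",
        "Bright cherry flavors; drink through 2025.",
    ]
})

remove_list = [
    r"[^\w\s]",      # any character that is neither a word character nor whitespace
    r"\b\w{1,4}\b",  # whole words of 4 or fewer characters (digits count as \w)
    # Word boundaries here are an added assumption, not part of the answer.
    r"\bflavors\b", r"\baromas\b", r"\bfinish\b", r"\bdrink\b",
]
pattern = "|".join(remove_list)

# str.replace returns a new Series, so the result is assigned back.
df11["description"] = (
    df11["description"]
    .str.replace(pattern, "", regex=True)
    .str.replace(r"\s+", " ", regex=True)  # collapse gaps left by removed words
    .str.strip()
)

print(df11["description"].tolist())
# ['smooth lingering', 'Bright cherry through']
```

The match is case-sensitive, so a capitalized "Drink" or "Aromas" would survive. Lowercasing the column first with .str.lower() is one possible way to handle that, if losing the original case is acceptable.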
