Remove words from a dataframe column based on a word list in a .txt file
I have a dataframe column like the one below.
Index Positif
1 keren banget mobilnya
2 bagus kendaraannya keren deh
3 mobilnya baik jalannya
4 suara mesinnya indah dan baik
And I have a list of words, stored in a .txt file, that contains:
keren
bagus
baik
indah
I want the column to contain only the words that appear in the .txt file. Desired output:
Index Positif
1 keren
2 bagus keren
3 baik
4 indah baik
Try:
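# Words to keep (the contents of the .txt file).
words = ["keren", "bagus", "baik", "indah"]

# Extract every listed word, then join the matches of each row with spaces.
df["Positif"] = df.index.map(
    df["Positif"]
    .str.extractall("(" + "|".join(words) + ")")
    .groupby(level=0)
    .agg(" ".join)[0]
)
print(df)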
Prints:
Index Positif
0 1 keren
1 2 bagus keren
2 3 baik
3 4 indah baik
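
The word list in the question lives in a .txt file, and a plain alternation such as `keren|bagus|baik|indah` also matches inside longer words (for example "baik" inside "terbaik"). The following is a minimal sketch of one way to handle both points, assuming the file holds one word per line as the question's listing suggests. The file name `positive_words.txt` and the helpers `load_words`, `build_pattern`, and `keep_listed_words` are made up for illustration and are not part of the original question or answer.

```python
import re
import tempfile
from pathlib import Path


def load_words(path):
    """Read one word per line from a text file, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_pattern(words):
    """Compile a regex that matches any listed word as a whole word only."""
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(r"\b(?:" + alternatives + r")\b")


def keep_listed_words(text, pattern):
    """Return the listed words found in `text`, in order, joined by spaces."""
    return " ".join(pattern.findall(text))


# Demo: write the word list to a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    word_file = Path(tmp) / "positive_words.txt"  # hypothetical file name
    word_file.write_text("keren\nbagus\nbaik\nindah\n", encoding="utf-8")
    pattern = build_pattern(load_words(word_file))

# The rows of the "Positif" column from the question.
rows = [
    "keren banget mobilnya",
    "bagus kendaraannya keren deh",
    "mobilnya baik jalannya",
    "suara mesinnya indah dan baik",
]
cleaned = [keep_listed_words(row, pattern) for row in rows]
print(cleaned)
```

With a dataframe, the same helper could be applied row by row, for instance `df["Positif"] = df["Positif"].map(lambda s: keep_listed_words(s, pattern))`. A row that contains none of the listed words comes back as an empty string in this sketch.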