How to drop rows from a pandas DataFrame that contain any of several strings in a particular column?
I have been trying to do this by creating multiple dataframes, one filter per string, but I cannot get it to work for more than two strings. What I want is to remove the rows that contain any of several strings.
data3 = data[~data.column.str.contains("remove words")]
data3 = data3[~data3.column.str.contains("remove me")]
data3.count()
I have also tried this, but it did not work.
df = df[~df.column.isin(['remove words'])]
or
df = df[~df.column.isin(['remove words', 'remove me'])]
You can apply the boolean mask with explicit loc notation. Note that plain df[mask] accepts a boolean mask as well, and that isin compares entire cell values rather than substrings, so this only drops rows whose column is exactly equal to one of the listed strings.
df.loc[~df.column.isin(['remove words', 'remove me'])]
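To see why an isin mask can leave the unwanted rows in place, here is a minimal sketch on made-up data (the column name `text` and the sample rows are assumptions for illustration, not taken from the question). It compares isin, which tests whole cell values, with str.contains, which searches inside each string.

```python
import pandas as pd

# Hypothetical sample data; the column name "text" is an assumption.
df = pd.DataFrame(
    {"text": ["remove me", "try this remove me stuff", "keep this row"]}
)
targets = ["remove words", "remove me"]

# isin: True only where the whole cell equals one of the targets.
exact_mask = df["text"].isin(targets)

# str.contains: True where any target occurs anywhere inside the cell.
substring_mask = df["text"].str.contains("|".join(targets), regex=True)

kept_by_isin = df.loc[~exact_mask]          # still holds the "try this ..." row
kept_by_contains = df.loc[~substring_mask]  # only the row without any target
```

If the goal is to drop rows that merely contain one of the strings, the substring mask is the one that matches that intent.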
I think you were on the right path.
Let's define a toy dataframe:
>>> import pandas as pd
>>> df = pd.DataFrame([("i have a car", 2),
...                    ("cows make milk", 3),
...                    ("try this remove me stuff", 5),
...                    ("please remove words", 51)],
...                   columns=["text", "number"])
And here you go:
>>> words_to_avoid = ["remove me", "remove words"]
>>> df[df.text.apply(
...     lambda txt: not any(word_to_avoid in txt for word_to_avoid in words_to_avoid)
... )]
             text  number
0    i have a car       2
1  cows make milk       3
Try this approach:
df2 = df1[~df1.column.str.contains('remove words|remove me', regex=True)]
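When the strings to remove live in a list, the alternation pattern can be built programmatically. The following is a sketch under assumptions: the sample rows and the list `words_to_avoid` are invented for illustration. It escapes each entry with re.escape so that regex metacharacters are matched literally, and passes na=False so that missing values count as non-matches instead of breaking the mask.

```python
import re

import pandas as pd

# Hypothetical sample data, including a missing value and regex metacharacters.
df1 = pd.DataFrame(
    {"column": ["i have a car", "please remove words", None, "costs $5 (approx.)"]}
)
words_to_avoid = ["remove words", "remove me", "(approx.)"]

# Escape each entry so characters such as "(" and "." are matched literally.
pattern = "|".join(re.escape(word) for word in words_to_avoid)

# na=False treats missing values as "no match", so those rows are kept.
mask = df1["column"].str.contains(pattern, regex=True, na=False)
df2 = df1[~mask]
```

Whether rows with missing values should be kept or dropped depends on the data; pass na=True instead if they should be removed along with the matches.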