How do I remove rows from a Pandas dataframe that contain any of several specific strings in a particular column?

Question

I have been trying to do this by creating a new dataframe for each string I want to filter out, but I cannot get it to work for more than two strings. All I want is to remove the rows that contain any of several strings.

data3 = data[~data.column.str.contains("remove words")]
data3 = data3[~data3.column.str.contains("remove me")]

data3.count

I have also tried this, but it did not work.

df = df[~df.column.isin(['remove words'])]

or

df = df[~df.column.isin(['remove words', 'remove me'])]

Answer 1

You can apply the boolean mask with the explicit loc notation. Plain df[mask] indexing selects the same rows, so loc mainly makes the row selection explicit.

df.loc[~df.column.isin(['remove words', 'remove me'])]
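One caveat worth checking before relying on this: isin compares whole cell values, so it only drops rows whose column is exactly equal to one of the listed strings. A minimal sketch, assuming a toy dataframe with a hypothetical column named text:

```python
import pandas as pd

# Hypothetical toy data; the column name "text" is an assumption for illustration.
df = pd.DataFrame(
    {"text": ["remove me", "try this remove me stuff", "i have a car"]}
)

# isin tests for exact equality with the listed values, not for substrings.
filtered = df.loc[~df["text"].isin(["remove words", "remove me"])]

# The row that merely contains "remove me" inside a longer string is kept.
remaining = filtered["text"].tolist()
print(remaining)
```

If the goal is to drop rows that contain the strings anywhere in the cell, a substring test such as str.contains is the closer fit.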

Answer 2

I think you were on the right path.

Let's define a toy dataframe:

>>> import pandas as pd
>>> df = pd.DataFrame([("i have a car", 2),
...     ("cows make milk", 3),
...     ("try this remove me stuff", 5),
...     ("please remove words", 51)],
...     columns=["text", "number"])

And here you go:

>>> words_to_avoid = ["remove me", "remove words"]
>>> df[df.text.apply(
...     lambda txt: not any(word_to_avoid in txt for word_to_avoid in words_to_avoid)
... )]
             text  number
0    i have a car       2
1  cows make milk       3
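One thing to watch with the apply approach: the `in` test raises a TypeError on non-string cells such as missing values. A minimal sketch of a guarded version, assuming missing values should be kept; the helper name keep_row and the sample data are hypothetical:

```python
import pandas as pd

# Hypothetical data with a missing value in the text column.
df = pd.DataFrame(
    {"text": ["i have a car", None, "please remove words"], "number": [2, 3, 51]}
)
words_to_avoid = ["remove me", "remove words"]

def keep_row(txt):
    """Return True when the row should stay in the dataframe."""
    # Non-string cells (e.g. missing values) cannot be searched with `in`;
    # this sketch assumes they should be kept.
    if not isinstance(txt, str):
        return True
    return not any(word in txt for word in words_to_avoid)

filtered = df[df["text"].apply(keep_row)]
print(filtered)
```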

Answer 3

Try this approach:

df2 = df1[~df1.column.str.contains('remove words|remove me', regex=True)]
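When the strings to drop are held in a list, the pattern can be built from it rather than typed by hand. A minimal sketch, assuming a hypothetical column named text: re.escape guards against regex metacharacters in the strings, and na=False treats missing values as non-matches, so those rows are kept.

```python
import re

import pandas as pd

# Hypothetical data; the column name "text" is an assumption for illustration.
df1 = pd.DataFrame(
    {
        "text": [
            "i have a car",
            "please remove words",
            None,
            "try this remove me stuff",
            "cost (usd)",
        ]
    }
)
words_to_avoid = ["remove words", "remove me", "(usd)"]

# Escape each string so characters like "(" are matched literally,
# then join the alternatives with "|".
pattern = "|".join(re.escape(word) for word in words_to_avoid)

# na=False makes missing values count as "does not contain".
mask = df1["text"].str.contains(pattern, regex=True, na=False)
df2 = df1[~mask]
print(df2)
```

Without the escaping step, a string such as "(usd)" would be read as a regex group and would match "usd" without the parentheses.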

Answer 1
0 2020-02-03 15:20:11

Answer 2
0 2020-02-03 15:22:28

Answer 3
0 2020-12-21 08:51:37
