Python - Pandas - .str.contains filter for multiple columns

I am currently using the .str.contains function from the pandas module to filter cells containing certain text. I have managed to get the filter to work for one column, but after some research and testing I am unable to get it to filter on two columns.

Example Input Data:

[Image: example input data]

Syntax Example 1:

Each of these filters works on its own and produces the following output:

test1 = data[data["Date"].str.contains("Tue 02 Feb 2021")]

[Image: output of test1]

test2 = data[data["Agent"].str.contains("NaN", na=True, regex=False)]

[Image: output of test2]
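As a side note on test2: with regex=False the text "NaN" is searched for literally, so the rows it returns come from na=True, which treats missing values as matches. A minimal sketch of that behaviour, assuming the Agent column holds real missing values rather than the string "NaN" (the Series below is made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the Agent column; None marks a missing agent.
agents = pd.Series(["Alice", None, "Bob"])

# No name contains the literal text "NaN"; na=True turns the missing
# entry into True, so only the missing row is selected.
mask = agents.str.contains("NaN", na=True, regex=False)

# Under this assumption, isna() selects the same rows more directly.
same_rows = agents.isna()
```

If the column really contains the string "NaN", the two masks would differ, so this equivalence only holds under the stated assumption.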

Syntax Example 2:

When I try to combine these using |, it does not work, but it does not return any errors either:

test3 = data[data["Agent"].str.contains("NaN", na=True, regex=False) | data["Date"].str.contains("Tue 02 Feb 2021")]

You need to put the conditional expression in parentheses:

data[(data["Agent"].str.contains("NaN", na=True, regex=False) | data["Date"].str.contains("Tue 02 Feb 2021"))]
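For context, parentheses matter most when each condition uses a comparison operator such as ==, because | binds more tightly than ==. Method calls like .str.contains(...) are evaluated before | either way. A small sketch with a made-up DataFrame (the names df, a and b are hypothetical):

```python
import pandas as pd

# Hypothetical data, only to illustrate operator precedence.
df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1]})

# Without parentheses, `df["a"] == 1 | df["b"] == 1` is parsed as the
# chained comparison `df["a"] == (1 | df["b"]) == 1`, which fails with a
# ValueError about the ambiguous truth value of a Series.
# Wrapping each comparison makes the intent explicit:
mask = (df["a"] == 1) | (df["b"] == 1)
filtered = df[mask]
```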

Try this (assuming numpy is imported as np):

test3 = data[np.logical_or(data["Agent"].str.contains("NaN", na=True, regex=False), data["Date"].str.contains("Tue 02 Feb 2021"))]

You can also use the | operator with the conditions enclosed in parentheses, since | is overloaded as an element-wise OR for numpy arrays (and pandas Series).
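If more than two columns need to be checked, one possible extension of the np.logical_or idea is np.logical_or.reduce over a list of conditions. This is only a sketch: the rows and the third condition on Description are made up for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data using the column names from the question.
data = pd.DataFrame({
    "Agent": ["Alice", "Bob", None, "Carol"],
    "Description": ["example", "urgent call", "example", "example"],
    "Date": ["Mon 01 Feb 2021", "Wed 03 Feb 2021",
             "Wed 03 Feb 2021", "Tue 02 Feb 2021"],
})

conditions = [
    data["Agent"].str.contains("NaN", na=True, regex=False),
    data["Date"].str.contains("Tue 02 Feb 2021"),
    data["Description"].str.contains("urgent", regex=False),  # made-up filter
]

# Element-wise OR across all conditions; the result is a boolean numpy array.
mask = np.logical_or.reduce(conditions)
result = data[mask]
```

For an AND across all conditions, np.logical_and.reduce works the same way.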

If I'm understanding this correctly, you want to filter your dataframe down to the rows that match both conditions, right? In that case I think you want & instead of |:

>>> test3 = data[data["Agent"].str.contains("NaN", na=True, regex=False) & data["Date"].str.contains("Tue 02 Feb 2021")]
>>> print(test3)
  Agent Description             Date
2  None     example  Tue 02 Feb 2021
4  None     example  Tue 02 Feb 2021
>>> 
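To see the difference between | and & concretely, here is a minimal, self-contained sketch. The rows are made up for illustration (the original input was only posted as an image); only the column names and the two filters come from the question.

```python
import pandas as pd

# Hypothetical rows; None marks a missing agent.
data = pd.DataFrame({
    "Agent": ["Alice", "Bob", None, None, None],
    "Description": ["example"] * 5,
    "Date": ["Mon 01 Feb 2021", "Tue 02 Feb 2021", "Tue 02 Feb 2021",
             "Wed 03 Feb 2021", "Tue 02 Feb 2021"],
})

agent_missing = data["Agent"].str.contains("NaN", na=True, regex=False)
on_date = data["Date"].str.contains("Tue 02 Feb 2021")

# OR keeps rows that satisfy at least one condition.
either = data[agent_missing | on_date]

# AND keeps only rows that satisfy both conditions.
both = data[agent_missing & on_date]
```

With this data, | keeps every row except the first, while & keeps only the rows with a missing agent on Tue 02 Feb 2021. So | returning "too many" rows without an error is expected behaviour rather than a bug.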
