
pandas.Series.str.contains() is not finding a string which exists in the Series

I'm trying to match a bunch of names from a list to the names in one of the columns of a Pandas DataFrame. A small part of the DataFrame is shown below:

[Image: sample values from the pandas DataFrame]

The values in the column "Object ID" had some whitespace, which I stripped using the line:

df["Object ID"] = df["Object ID"].str.strip()

I am searching the column "Object ID" using the following line:

df[df["Object ID"].str.contains('EM* LkHA 115') == True]

The above line returns an empty DataFrame even though 'EM* LkHA 115' exists in the DataFrame, as shown below:

[Image: the value exists in df]

Any idea what I could be doing wrong? I would be happy to provide any further information if it would be of help.

Thanks in advance!

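str.contains() treats its pattern as a regular expression by default, so the '*' is read as a quantifier ("zero or more of the preceding 'M'") rather than as a literal asterisk. You have to escape the '*' character, ideally in a raw string so that Python does not try to interpret the backslash itself:

df[df["Object ID"].str.contains(r'EM\* LkHA 115')]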
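To see why the unescaped pattern fails, here is a minimal sketch using only the standard re module, outside pandas. The variable names are illustrative and not part of the original question.

```python
import re

name = "EM* LkHA 115"

# Unescaped: "M*" means "zero or more M", so the regex expects a space
# right after the optional M's and never matches the literal asterisk.
unescaped_hit = re.search(name, name) is not None

# The same unescaped pattern does match a name that has no asterisk.
no_star_hit = re.search(name, "EM LkHA 115") is not None

# Escaped: "\*" matches a literal asterisk.
escaped_hit = re.search(r"EM\* LkHA 115", name) is not None

print(unescaped_hit, no_star_hit, escaped_hit)  # False True True
```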

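Also, you don't need the == True, provided the column has no missing values. str.contains() returns NaN for missing entries; passing na=False covers that case.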

As @MustafaAydın says in the comment below, you can use re.escape from the standard re module to escape the search string dynamically:

import re

df[df["Object ID"].str.contains(re.escape('EM* LkHA 115'))]
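As a rough end-to-end sketch, the snippet below builds a small made-up DataFrame (the rows are assumptions for illustration, not the asker's data) and compares the unescaped search with two ways of matching the name literally: re.escape, and the regex=False parameter of str.contains(), which turns off regular-expression matching altogether.

```python
import re
import pandas as pd

# Hypothetical sample data; the real DataFrame is only shown as an image.
df = pd.DataFrame({"Object ID": ["  EM* LkHA 115 ", "V* V1057 Cyg", "EM LkHA 115"]})
df["Object ID"] = df["Object ID"].str.strip()

target = "EM* LkHA 115"

# Treated as a regex: misses the row with the asterisk, matches "EM LkHA 115".
unescaped = df[df["Object ID"].str.contains(target)]

# Escaped regex: matches the literal name.
escaped = df[df["Object ID"].str.contains(re.escape(target))]

# Plain substring search: no regex involved, so no escaping is needed.
literal = df[df["Object ID"].str.contains(target, regex=False)]

print(unescaped["Object ID"].tolist())
print(escaped["Object ID"].tolist())
print(literal["Object ID"].tolist())
```

If the names in the list should match whole cell values rather than substrings, df["Object ID"].isin(list_of_names) may be a simpler option, assuming exact matches are what is wanted.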
