I'm trying to match a bunch of names from a list to the names in one of the columns of a Pandas DataFrame. A small part of the DataFrame is shown below:
The values in the column "Object ID" had some leading and trailing whitespace, which I stripped using the line:
df["Object ID"] = df["Object ID"].str.strip()
I am searching the column "Object ID" using the following line:
df[df["Object ID"].str.contains('EM* LkHA 115') == True]
The above line returns an empty DataFrame even though 'EM* LkHA 115' exists in the DataFrame, as shown below:
Any idea what I could be doing wrong? I would be happy to provide any further information if it would be of help.
Thanks in advance!
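You have to escape the '*' character. By default, str.contains treats its argument as a regular expression, so 'M*' means "zero or more M" rather than an 'M' followed by a literal asterisk.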
df[df["Object ID"].str.contains(r'EM\* LkHA 115')]
Also, you don't need the == True, because str.contains already returns a boolean Series.
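To see why the unescaped pattern finds nothing, here is a minimal sketch that uses Python's re module directly. The test strings other than 'EM* LkHA 115' are made up for illustration.

```python
import re

pattern = "EM* LkHA 115"  # unescaped: "M*" means "zero or more M"

# The literal name does not match: after "EM" the regex expects a space,
# but the string has "*" at that position.
no_match = re.search(pattern, "EM* LkHA 115")

# Strings without the asterisk do match the unescaped pattern.
match_one_m = re.search(pattern, "EM LkHA 115")
match_zero_m = re.search(pattern, "E LkHA 115")

# With the asterisk escaped (raw string), the literal name matches.
escaped_match = re.search(r"EM\* LkHA 115", "EM* LkHA 115")

print(no_match is None)           # unescaped pattern misses the real name
print(escaped_match is not None)  # escaped pattern finds it
```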
As @MustafaAydın says in the comment below, you can use re.escape from the standard re module to escape the search string dynamically.
import re
df[df["Object ID"].str.contains(re.escape('EM* LkHA 115'))]
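If no regex features are needed at all, two other options may be simpler. The sketch below uses a small made-up DataFrame (the rows and the `names` list are assumptions, not the asker's data) and shows `regex=False` for literal substring matching and `isin` for exact matching against a whole list of names.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
df = pd.DataFrame({"Object ID": ["  EM* LkHA 115 ", "HD 12345", "EM LkHA 115"]})
df["Object ID"] = df["Object ID"].str.strip()

name = "EM* LkHA 115"

# Escaped regex: matches the literal string, asterisk included.
escaped = df[df["Object ID"].str.contains(re.escape(name))]

# regex=False: treat the pattern as a plain substring, no escaping needed.
literal = df[df["Object ID"].str.contains(name, regex=False)]

# Exact matches against a list of names (hypothetical list).
names = ["EM* LkHA 115", "HD 12345"]
exact = df[df["Object ID"].isin(names)]

print(escaped)
print(literal)
print(exact)
```

Note that `isin` only finds rows whose value equals one of the names exactly, so it is a fit only when whole-cell matches are wanted rather than substring matches.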