I have a pandas DataFrame that looks like this:
>>> df
      product   desc
0        ABCD  desc1
1   ABCD1,XYZ  desc2
2      ABCD1H  desc3
3       ABCD1  desc4
4  ABCD1H,LMN  desc5
I want to keep only the rows whose product is ABCD1, or ABCD1 followed by any other product ID, but not ABCD1H. How can I select such rows? In the above example, I want the output to be:
>>> df
     product   desc
1  ABCD1,XYZ  desc2
3      ABCD1  desc4
This is what I have tried so far, but it does not work:
df2 = df.loc[df['product'].str.contains('ABCD1')]
It also includes ABCD1H in its results, because str.contains matches any substring. I don't want that to happen.
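Use the regex "\b", which matches a word boundary, so ABCD1 only matches when it is not followed by another letter, digit, or underscore: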
df[df['product'].str.contains(r'ABCD1\b')]
Output:
     product   desc
1  ABCD1,XYZ  desc2
3      ABCD1  desc4
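
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question. Two things in it are my own additions, not part of the answer above: a leading "\b" as well, so that a hypothetical ID such as XABCD1 would not match either, and a regex-free alternative that assumes the IDs are separated by plain commas with no surrounding spaces.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "product": ["ABCD", "ABCD1,XYZ", "ABCD1H", "ABCD1", "ABCD1H,LMN"],
    "desc": ["desc1", "desc2", "desc3", "desc4", "desc5"],
})

# \b on both sides: ABCD1 must start and end at a word boundary,
# so "ABCD1H" (and a hypothetical "XABCD1") are rejected.
mask = df["product"].str.contains(r"\bABCD1\b", regex=True)
result = df[mask]

# Regex-free alternative (assumes comma-separated IDs, no spaces):
# split each cell into a list of IDs and test for exact membership.
has_id = df["product"].str.split(",").apply(lambda ids: "ABCD1" in ids)
exact = df[has_id]

print(result)
```

Note that "\b" treats letters, digits, and underscores as word characters, so an ID like ABCD1_X would not match the regex but an ID like ABCD1-X would. If the IDs can contain hyphens or other punctuation, the split-based version is probably the safer choice.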