
Filtering data frame using str.contains('') but with an exception

I am trying to filter the rows of my data frame on the 'PRODUCT' column using str.contains('DE'). The DE codes range from DE001 up to DE999.

How do I filter out DE998 and DE999? I have been trying this code, but I can't figure out a way to remove DE998 and DE999 without doing it manually on another line.

I am using df2[df2['PRODUCT'].str.contains("DE")]. Can anyone suggest code for this, or a more efficient way to do it? Thank you for answering. Sorry, I'm still a newbie programmer.

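You can create two Boolean masks: one that tests whether the first two characters are 'DE', and one that tests whether the entire string is one of the values to exclude. Prefixing the second condition with ~ negates it. Then combine the two masks with the & operator. Note that the first mask only matches 'DE' at the start of the string, whereas str.contains("DE") matches it anywhere in the string.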

# Keep rows whose PRODUCT starts with 'DE'
mask1 = df2['PRODUCT'].str[:2] == 'DE'
# Drop the two unwanted codes (~ negates the isin test)
mask2 = ~df2['PRODUCT'].isin(['DE998', 'DE999'])

res = df2[mask1 & mask2]
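
To see the two-mask approach end to end, here is a minimal runnable sketch. The sample rows are made up for illustration; only the column name 'PRODUCT' and the codes DE998 and DE999 come from the question. It also shows a possible single-expression alternative using str.match with a negative lookahead, which is not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data (assumption): only the 'PRODUCT' column name
# and the DE998/DE999 codes come from the question.
df2 = pd.DataFrame(
    {"PRODUCT": ["DE001", "DE500", "DE998", "DE999", "FR001", "ABCDE"]}
)

# Two-mask approach: starts with 'DE' AND is not one of the excluded codes.
mask1 = df2["PRODUCT"].str.startswith("DE")
mask2 = ~df2["PRODUCT"].isin(["DE998", "DE999"])
res = df2[mask1 & mask2]

# Alternative sketch: str.match anchors the pattern at the start of the
# string, and the negative lookahead rejects exactly 'DE998' and 'DE999'.
res_regex = df2[df2["PRODUCT"].str.match(r"DE(?!99[89]$)")]

print(res["PRODUCT"].tolist())
print(res_regex["PRODUCT"].tolist())
```

If the column may contain missing values, passing na=False to str.startswith or str.match should keep the mask strictly Boolean so it can be used for indexing.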
