I am trying to filter rows of my data frame on the column 'PRODUCT' using str.contains('DE'). The DE codes range from DE001 up to DE999. How do I filter out DE998 and DE999? I have been trying the code below, but I can't figure out a way to remove DE998 and DE999 without doing it manually on another line.
I am using df2[df2['PRODUCT'].str.contains("DE")]. Can anyone suggest code for this, or a more efficient way to do it? Thank you for answering. Sorry, still a newbie programmer.
You can create two Boolean masks: one testing the first two characters and the other testing the entire string. For the second condition, use ~ to negate the test. Then combine the two masks with the & operator.
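# Mask 1: the first two characters are 'DE'
mask1 = df2['PRODUCT'].str[:2] == 'DE'
# Mask 2: the full value is neither 'DE998' nor 'DE999'
mask2 = ~df2['PRODUCT'].isin(['DE998', 'DE999'])
# Keep only the rows that satisfy both conditions
res = df2[mask1 & mask2]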
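To try this out end to end, here is a minimal runnable sketch. The contents of df2 are made up for illustration, since the question does not show the real data, and str.startswith('DE') is used in place of str[:2] == 'DE' as an equivalent prefix test for string values. The 'ABDE01' row is included to show one difference from str.contains('DE'), which would also match 'DE' in the middle of a value.

```python
import pandas as pd

# Hypothetical sample data; the real df2 is not shown in the question.
df2 = pd.DataFrame(
    {"PRODUCT": ["DE001", "DE500", "DE998", "DE999", "FR001", "ABDE01"]}
)

# Mask 1: value starts with "DE" (a prefix test, unlike str.contains,
# which would also match "ABDE01").
mask1 = df2["PRODUCT"].str.startswith("DE")

# Mask 2: value is not one of the codes to exclude.
mask2 = ~df2["PRODUCT"].isin(["DE998", "DE999"])

# Keep rows where both conditions hold.
res = df2[mask1 & mask2]
print(res["PRODUCT"].tolist())  # ['DE001', 'DE500']
```

If the column can contain missing values, the prefix mask may need extra handling (for example, passing na=False to str.startswith) before it can be used for indexing.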