I have a table that looks like this:
Date Col0 Col1 Col2
2-18-2019 1 ap sd 23
2-18-2019 2 dh au 88
2-18-2019 3 ap hre 92
2-19-2019 1 sd ap 23
2-19-2019 2 sd ap 78
2-19-2019 3 ap sd 78
2-20-2019 1 ap sd 37
2-20-2019 2 sd ap 29
2-20-2019 3 djd dh 34
2-21-2019 1 eds ed 44
2-21-2019 2 u4r rg 34
2-21-2019 3 ufif ew 23
2-22-2019 1 eds sd 44
2-22-2019 2 u4r rg 34
2-22-2019 3 ap ew 23
I need to keep the last row containing the keywords when they are repeated over several days. If the keywords show up again a few days later, I need to include those rows too, as in the result table below.
The result I'm looking for should be something like this:
Date Col0 Col1 Col2
2-19-2019 3 ap sd 78
2-20-2019 1 ap sd 37
2-20-2019 2 sd ap 29
2-22-2019 1 eds sd 44
2-22-2019 3 ap ew 23
I tried this:
df = df[(df.Col1.str.contains('ap')) | (df.Col1.str.contains('sd'))]
but this gives me this result:
Date Col0 Col1 Col2
2-18-2019 1 ap sd 23
2-19-2019 1 sd ap 23
2-19-2019 2 sd ap 78
2-19-2019 3 ap sd 78
2-20-2019 1 ap sd 37
2-20-2019 2 sd ap 29
2-22-2019 1 eds sd 44
2-22-2019 3 ap ew 23
This is wrong because it returns every matching row. The difference between this result and the desired one below is that if the condition is not met for one or more days (by the Date column) and then matches again, I need to repeat the process:
Date Col0 Col1 Col2
2-19-2019 3 ap sd 78
2-20-2019 1 ap sd 37
2-20-2019 2 sd ap 29
2-22-2019 1 eds sd 44
2-22-2019 3 ap ew 23
Thanks
If I understand correctly, use:
df = df[df.Col1.str.contains('ap|sd')].drop_duplicates('Col1', keep='last')
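To see what this one-liner does on the sample data, here is a minimal sketch. It assumes the table has four columns, with Col0 being the per-day row number and Col1 holding the two-word key (for example 'ap sd'). The question's header does not make the layout explicit, so that reading is a guess, and the names rows, mask and result are only for illustration.

```python
import pandas as pd

# Assumed layout: Date, Col0 (row number within the day),
# Col1 (two-word key), Col2 (value).
rows = [
    ("2-18-2019", 1, "ap sd", 23),
    ("2-18-2019", 2, "dh au", 88),
    ("2-18-2019", 3, "ap hre", 92),
    ("2-19-2019", 1, "sd ap", 23),
    ("2-19-2019", 2, "sd ap", 78),
    ("2-19-2019", 3, "ap sd", 78),
    ("2-20-2019", 1, "ap sd", 37),
    ("2-20-2019", 2, "sd ap", 29),
    ("2-20-2019", 3, "djd dh", 34),
    ("2-21-2019", 1, "eds ed", 44),
    ("2-21-2019", 2, "u4r rg", 34),
    ("2-21-2019", 3, "ufif ew", 23),
    ("2-22-2019", 1, "eds sd", 44),
    ("2-22-2019", 2, "u4r rg", 34),
    ("2-22-2019", 3, "ap ew", 23),
]
df = pd.DataFrame(rows, columns=["Date", "Col0", "Col1", "Col2"])

# Keep rows whose Col1 contains 'ap' or 'sd' (regex alternation).
mask = df["Col1"].str.contains("ap|sd")

# Among those rows, keep only the last occurrence of each distinct Col1 value.
result = df[mask].drop_duplicates("Col1", keep="last")
print(result)
```

Under this reading, the snippet keeps one row per distinct Col1 value among the matching rows, namely the last one in the table. That may not line up exactly with the desired table in the question, so it is worth comparing the printed output against what you expect.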