Python filter repeated rows with condition

I have a table that looks like this:

Date           Col0          Col1            Col2   
2-18-2019       1            ap sd            23
2-18-2019       2            dh au            88
2-18-2019       3            ap hre           92
2-19-2019       1            sd ap            23
2-19-2019       2            sd ap            78
2-19-2019       3            ap sd            78
2-20-2019       1            ap sd            37
2-20-2019       2            sd ap            29
2-20-2019       3            djd dh           34
2-21-2019       1            eds ed           44
2-21-2019       2            u4r rg           34
2-21-2019       3            ufif ew          23
2-22-2019       1            eds sd           44
2-22-2019       2            u4r rg           34
2-22-2019       3            ap ew            23

I need to keep only the last row containing the keywords when they are repeated over several days. If the keywords show up again a few days later, I need to include those rows too, as in the result table below.

The result I'm looking for should be something like this:

Date           Col0          Col1            Col2   
2-19-2019       3            ap sd            78
2-20-2019       1            ap sd            37
2-20-2019       2            sd ap            29
2-22-2019       1            eds sd           44
2-22-2019       3            ap ew            23

I tried this:

df = df[(df.Col1.str.contains('ap')) | (df.Col1.str.contains('sd'))]

but this gives me this result:

Date           Col0          Col1            Col2   
2-18-2019       1            ap sd            23
2-19-2019       1            sd ap            23
2-19-2019       2            sd ap            78
2-19-2019       3            ap sd            78
2-20-2019       1            ap sd            37
2-20-2019       2            sd ap            29
2-22-2019       1            eds sd           44
2-22-2019       3            ap ew            23

This is wrong, since it returns every matching row. The difference between the result I get and the desired one below is that if the condition is not met for one or more days (Date column) and the keywords then show up again, I need to repeat the process:

Date           Col0          Col1            Col2   
2-19-2019       3            ap sd            78
2-20-2019       1            ap sd            37
2-20-2019       2            sd ap            29
2-22-2019       1            eds sd           44
2-22-2019       3            ap ew            23

Thanks

IIUC, use:

df = df[df.Col1.str.contains('ap|sd')].drop_duplicates('Col1', keep='last')
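
To make the one-liner above easier to try, here is a minimal runnable sketch. It assumes the sample frame is typed in by hand from the question's table, with Date kept as plain strings; the names `sample`, `matched` and `result` are only illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data copied from the question's table (assumption: Date stays a string).
sample = pd.DataFrame({
    "Date": ["2-18-2019"] * 3 + ["2-19-2019"] * 3 + ["2-20-2019"] * 3
            + ["2-21-2019"] * 3 + ["2-22-2019"] * 3,
    "Col0": [1, 2, 3] * 5,
    "Col1": ["ap sd", "dh au", "ap hre",
             "sd ap", "sd ap", "ap sd",
             "ap sd", "sd ap", "djd dh",
             "eds ed", "u4r rg", "ufif ew",
             "eds sd", "u4r rg", "ap ew"],
    "Col2": [23, 88, 92, 23, 78, 78, 37, 29, 34, 44, 34, 23, 44, 34, 23],
})

# Keep rows whose Col1 contains "ap" or "sd" ("ap|sd" is a regex alternation).
matched = sample[sample.Col1.str.contains("ap|sd")]

# Of those rows, keep only the last occurrence of each distinct Col1 string.
result = matched.drop_duplicates("Col1", keep="last")
print(result)
```

Note that `drop_duplicates` compares whole `Col1` strings across the entire frame rather than per day, so `ap sd` and `sd ap` count as different values and only the last row of each survives. The substring match also picks up `ap hre`. If the output differs from the desired table, the rule for which repeated rows to keep probably needs to be stated more precisely.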
