简体   繁体   中英

Filter similar rows based on timestamp as index in a dataframe

I have a dataframe with timestamp as index. I want to filter out the rows with similar timestamp and print it out. For example,

                        Value_1  Value_2
timestamp                                           
2022-03-22 17:31:02     23        30    
2022-03-22 17:31:02     25        40
2022-03-22 17:31:04     24        0.2   
2022-03-22 17:31:05     21        90    
2022-03-22 17:31:05     20        0.1

I want to filter out the rows with similar timestamps as shown below.

                        Value_1  Value_2
timestamp                                           
2022-03-22 17:31:02     23        30    
2022-03-22 17:31:02     25        40
2022-03-22 17:31:05     21        90    
2022-03-22 17:31:05     20        0.1

I tried using melt and groupby . It does not give me the desired results. Could someone help me with it?

Thanks.

We can use index.duplicated with parameter keep=False (meaning if there is a duplicate value then mark all occurrence of that duplicate value to True )

df[df.index.duplicated(keep=False)]

                     Value_1  Value_2
timestamp                            
2022-03-22 17:31:02       23     30.0
2022-03-22 17:31:02       25     40.0
2022-03-22 17:31:05       21     90.0
2022-03-22 17:31:05       20      0.1

You can also use groupby + filter :

filtered_df = df.groupby(level=0).filter(lambda x: len(x) > 1)

Output:

>>> filtered_df
                     Value_1  Value_2
timestamp                            
2022-03-22 17:31:02       23     30.0
2022-03-22 17:31:02       25     40.0
2022-03-22 17:31:05       21     90.0
2022-03-22 17:31:05       20      0.1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM