
How to remove next pandas dataframe row when it's equal to previous row only for some columns

I have created a dataframe called df with this code:

import pandas as pd

# Build the data as a dict mapping column names to lists of values
data = {'ID': [1,2,3,4,5,6,7],
        'feature1': [100,32,100,100,100,93,100],
        'feature2': [100,32,100,100,100,93,100],
        'feature3': [100,32,100,100,100,93,100],
        }

# Create DataFrame
df = pd.DataFrame(data)

The dataframe looks like this:

print(df)

   ID  feature1  feature2  feature3
0   1       100       100       100
1   2        32        32        32
2   3       100       100       100
3   4       100       100       100
4   5       100       100       100
5   6        93        93        93
6   7       100       100       100

I want to remove the rows in which the values of the columns

  • feature1,
  • feature2 and
  • feature3

are all exactly the same as in the previous row. In the example above, I need to remove rows 3 and 4, so that the resulting dataframe will look like this:

[image: the expected dataframe]

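Select the feature columns with filter(like='feature'), take the difference between each row and the previous one, and flag the rows where that difference is 0 in every feature column. Negating the flag keeps only the rows that differ from their predecessor. The first row has no predecessor, so its difference is NaN and it is always kept: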

df[~df.filter(like='feature').diff().eq(0).all(axis=1)]

   ID  feature1  feature2  feature3
0   1       100       100       100
1   2        32        32        32
2   3       100       100       100
5   6        93        93        93
6   7       100       100       100
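Note that diff() only works on numeric columns. If the feature columns could hold strings or other non-numeric values, one possible variation is to compare each row with a shifted copy instead of subtracting. The following is a minimal sketch of that idea, not part of the answer above; the names feature_cols, same_as_previous and result are only illustrative.

```python
import pandas as pd

data = {'ID': [1, 2, 3, 4, 5, 6, 7],
        'feature1': [100, 32, 100, 100, 100, 93, 100],
        'feature2': [100, 32, 100, 100, 100, 93, 100],
        'feature3': [100, 32, 100, 100, 100, 93, 100]}
df = pd.DataFrame(data)

# Columns to compare (illustrative name; ID is deliberately left out)
feature_cols = ['feature1', 'feature2', 'feature3']
features = df[feature_cols]

# shift() moves every row down by one, so row i is compared with row i-1.
# The first row is compared with NaN, which never equals anything,
# so the first row is never flagged.
same_as_previous = features.eq(features.shift()).all(axis=1)

# Keep only the rows that differ from the previous row in at least one feature
result = df[~same_as_previous]
print(result)
```

On the example data this should keep the same rows as the diff() version (index 0, 1, 2, 5 and 6). In both versions NaN never compares equal to NaN, so two consecutive rows with a missing value in the same feature column would both be kept.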
