How to remove the next pandas dataframe row when it's equal to the previous row only for some columns
I have created a dataframe called df with this code:
import pandas as pd

# Build the data as a dictionary of lists, one list per column
data = {'ID': [1, 2, 3, 4, 5, 6, 7],
        'feature1': [100, 32, 100, 100, 100, 93, 100],
        'feature2': [100, 32, 100, 100, 100, 93, 100],
        'feature3': [100, 32, 100, 100, 100, 93, 100],
        }
# Create DataFrame
df = pd.DataFrame(data)
The dataframe looks like this:
print(df)
   ID  feature1  feature2  feature3
0   1       100       100       100
1   2        32        32        32
2   3       100       100       100
3   4       100       100       100
4   5       100       100       100
5   6        93        93        93
6   7       100       100       100
I want to remove the rows in which the values of columns feature1, feature2 and feature3 are exactly the same as in the previous row. In the example above, I need to remove rows 3 and 4, so that only rows 0, 1, 2, 5 and 6 remain in the resulting dataframe.
Filter the feature-like columns, then calculate the difference between the previous and the current row, and check whether the difference is 0 for all the feature columns:
df[~df.filter(like='feature').diff().eq(0).all(axis=1)]
   ID  feature1  feature2  feature3
0   1       100       100       100
1   2        32        32        32
2   3       100       100       100
5   6        93        93        93
6   7       100       100       100
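
To make the one-liner easier to follow, here is a minimal sketch that splits it into named steps on the same example data. The intermediate variable names (features, deltas, same_as_previous) are only illustrative. The last part shows a variant that is not from the answer above: it compares each row with a shifted copy instead of subtracting, which should also work when the compared columns are not numeric (for example strings), where diff() would likely fail.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "feature1": [100, 32, 100, 100, 100, 93, 100],
    "feature2": [100, 32, 100, 100, 100, 93, 100],
    "feature3": [100, 32, 100, 100, 100, 93, 100],
})

# Step 1: keep only the columns whose name contains "feature".
features = df.filter(like="feature")

# Step 2: difference between each row and the previous one
# (the first row has no predecessor, so it becomes NaN).
deltas = features.diff()

# Step 3: flag rows where the difference is 0 in every feature column.
same_as_previous = deltas.eq(0).all(axis=1)

# Step 4: keep the rows that are not flagged.
result = df[~same_as_previous]
print(result)

# Variant (an assumption, not part of the original answer):
# compare with the previous row directly instead of subtracting.
same_as_previous_alt = features.eq(features.shift()).all(axis=1)
result_alt = df[~same_as_previous_alt]
```

Note that the original index labels are kept (0, 1, 2, 5, 6). If a consecutive index is wanted, result.reset_index(drop=True) can be applied afterwards.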