Python dataframe - drop consecutive rows based on a column
I need to remove consecutive rows based on a column value. My DataFrame is built like this:
import pandas as pd

df = pd.DataFrame({
    "CustID": ["c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1",
               "c2", "c2", "c2", "c2", "c2", "c2"],
    "saleValue": [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})
Printed, the DataFrame looks like this:
CustID saleValue Status
c1 10 0
c1 12 0
c1 13 0
c1 6 1
c1 4 1
c1 2 1
c1 11 0
c1 17 0
c1 1 1
c1 5 1
c2 8 1
c2 2 1
c2 16 0
c2 13 0
c2 1 1
c2 4 1
Within each CustID, whenever Status is 1 on consecutive rows, I need to keep only the first row of that run and drop the rest. Rows where Status is 0 should be left alone. What is the best way to do this?
The output should look like this:
CustID saleValue Status
c1 10 0
c1 12 0
c1 13 0
c1 6 1
c1 11 0
c1 17 0
c1 1 1
c2 8 1
c2 16 0
c2 13 0
c2 1 1
Create a Boolean mask for the entire DataFrame. Since the rows for each CustID are already contiguous, flag the rows where Status is 1, the previous row's Status is also 1, and the CustID is the same as on the previous row. These are the rows to drop, so keep the rest.
to_drop = (df['Status'].eq(1) & df['Status'].shift().eq(1) # Consecutive 1s
& df['CustID'].eq(df['CustID'].shift())) # Within same ID
df[~to_drop]
CustID saleValue Status
0 c1 10 0
1 c1 12 0
2 c1 13 0
3 c1 6 1
6 c1 11 0
7 c1 17 0
8 c1 1 1
10 c2 8 1
12 c2 16 0
13 c2 13 0
14 c2 1 1
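The mask above relies on the rows for each CustID being contiguous. If that cannot be guaranteed, one possible variation is to take the previous Status within each CustID using a grouped shift. The following is a minimal sketch under that assumption; the helper name `drop_repeated_status` and its parameters are hypothetical and not part of the original answer.

```python
import pandas as pd


def drop_repeated_status(df, id_col="CustID", status_col="Status", value=1):
    """Keep only the first row of each run of `value` within each ID.

    Hypothetical helper: "previous row" here means the previous row of the
    same ID, so the IDs do not need to be contiguous in the frame.
    """
    # Previous Status within the same ID; NaN for the first row of each ID.
    prev = df.groupby(id_col, sort=False)[status_col].shift()
    # A row is dropped when it and its predecessor in the same ID both equal `value`.
    to_drop = df[status_col].eq(value) & prev.eq(value)
    return df[~to_drop]


df = pd.DataFrame({
    "CustID": ["c1"] * 10 + ["c2"] * 6,
    "saleValue": [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})

result = drop_repeated_status(df)
print(result)
```

On the sample data this keeps the same rows as `df[~to_drop]` above. The grouped shift costs a little more than a plain `shift()`, so the original mask is probably still the simpler choice when the frame is known to be ordered by CustID.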