I need to remove consecutive rows based on a column value. My DataFrame is built as follows:
import pandas as pd

df = pd.DataFrame({
    "CustID": ["c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1",
               "c2", "c2", "c2", "c2", "c2", "c2"],
    "saleValue": [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})
The DataFrame looks like this:
CustID saleValue Status
c1 10 0
c1 12 0
c1 13 0
c1 6 1
c1 4 1
c1 2 1
c1 11 0
c1 17 0
c1 1 1
c1 5 1
c2 8 1
c2 2 1
c2 16 0
c2 13 0
c2 1 1
c2 4 1
Within each CustID, whenever Status is 1 on several consecutive rows, I need to keep only the first row of that run and drop the rest. Rows with Status 0 should be left untouched. Can you please let me know the best way to do it?
The output should look like this:
CustID saleValue Status
c1 10 0
c1 12 0
c1 13 0
c1 6 1
c1 11 0
c1 17 0
c1 1 1
c2 8 1
c2 16 0
c2 13 0
c2 1 1
Create a Boolean mask for the entire DataFrame.
Assuming the rows for each CustID are already contiguous (as they are here), find rows where Status is 1, the previous row's Status is also 1, and the CustID is the same as on the previous row. These are the rows to drop, so keep the rest.
to_drop = (df['Status'].eq(1) & df['Status'].shift().eq(1) # Consecutive 1s
& df['CustID'].eq(df['CustID'].shift())) # Within same ID
df[~to_drop]
CustID saleValue Status
0 c1 10 0
1 c1 12 0
2 c1 13 0
3 c1 6 1
6 c1 11 0
7 c1 17 0
8 c1 1 1
10 c2 8 1
12 c2 16 0
13 c2 13 0
14 c2 1 1
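If the rows for a given CustID are not guaranteed to be contiguous, a variant that shifts Status within each group may be safer. The following is only a sketch under that assumption; the names prev_status and result are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "CustID": ["c1"] * 10 + ["c2"] * 6,
    "saleValue": [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})

# Previous Status within the same customer; NaN on each customer's first row,
# so the first row of a customer can never be flagged for dropping.
prev_status = df.groupby("CustID")["Status"].shift()

# Flag a row when it and the previous row of the same customer both have Status 1.
to_drop = df["Status"].eq(1) & prev_status.eq(1)

result = df[~to_drop]
print(result)
```

On the sample data this should select the same rows as the mask above. The trade-off is an extra groupby, which the plain shift avoids when the data is known to be sorted by CustID.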