I'm trying to remove rows from a pandas DataFrame so that everything between two specific values (e.g., `start` and `end`) is deleted, including the two values themselves. These values can repeat, as in:
c1 | c2 |
---|---|
1 | 1 |
2 | start |
3 | 1 |
4 | 0 |
5 | end |
6 | 1 |
7 | start |
8 | 1 |
9 | 0 |
10 | end |
11 | 1 |
So the desired output would be:
c1 | c2 |
---|---|
1 | 1 |
6 | 1 |
11 | 1 |
I recreated a DataFrame similar to yours. This is not an efficient way to do it, but it works.
`df`:
c1 c2
0 1 1
1 2 start
2 3 3
3 4 end
4 5 5
5 6 start
6 7 end
7 8 0
Code:
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 3, 4, 5, 6, 7, 8], 'c2': ['1', 'start', '3', 'end', '5', 'start', 'end', 0]})
df2 = df.copy()
flag = False  # True while we are between 'start' and 'end'
for i, row in df.iterrows():
    if row['c2'] == 'start':
        flag = True
        df2 = df2.drop(i)  # drop the 'start' row itself
    elif row['c2'] == 'end':
        flag = False
        df2 = df2.drop(i)  # drop the 'end' row itself
    elif flag:
        df2 = df2.drop(i)  # drop rows inside the block
Output (`df2`):
c1 c2
0 1 1
4 5 5
7 8 0
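You can use masks. Note that these keep only the rows immediately before a `start` or immediately after an `end`, so they work for the sample data but would miss kept rows that are not adjacent to a marker: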
mask1 = df.c2.shift(-1) == "start"
mask2 = df.c2.shift(1) == "end"
newDf = (df.loc[mask1 | mask2]).reset_index(drop=True)
Output:
c1 c2
0 1 1
1 5 5
2 8 0
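The shift-based masks above depend on every kept row sitting right next to a marker. A more general mask can be sketched with cumulative sums: count the `start` markers seen so far, subtract the `end` markers seen so far, and drop any row where that running count is positive, plus the `end` rows themselves. The helper name `drop_between` and its parameters are hypothetical, and the sketch assumes the markers are properly paired and never nested.

```python
import pandas as pd


def drop_between(df, col="c2", start="start", end="end"):
    """Hypothetical helper: drop each start row, its end row, and everything between."""
    is_start = df[col].eq(start)
    is_end = df[col].eq(end)
    # The running count of starts minus ends is 1 from a start row up to
    # (but not including) its end row, and 0 everywhere else.
    inside = (is_start.cumsum() - is_end.cumsum()) > 0
    # The end row itself has a count of 0, so it is excluded explicitly.
    return df.loc[~(inside | is_end)].reset_index(drop=True)


# Data from the question.
question_df = pd.DataFrame({
    "c1": range(1, 12),
    "c2": [1, "start", 1, 0, "end", 1, "start", 1, 0, "end", 1],
})
result = drop_between(question_df)
print(result)
```

On the question's data this keeps the rows with `c1` equal to 1, 6 and 11, and it does not loop over rows in Python.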
Starting from Tamil's answer above, this is how I managed to implement it for my DataFrame. It should be more efficient, since it uses `itertuples` rather than `iterrows`.
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 3, 4, 5, 6, 7, 8], 'c2': ['1', 'start', '3', 'end', '5', 'start', 'end', 0]})
flag = False  # True while we are between 'start' and 'end'
to_remove = []  # index labels of the rows to drop
for row in df.itertuples():
    if row.c2 == 'start':
        flag = True
        to_remove.append(row.Index)
    elif row.c2 == 'end':
        flag = False
        to_remove.append(row.Index)
    elif flag:
        to_remove.append(row.Index)
# Drop by index label. Merging on c2 would also remove rows outside a block
# whose c2 value happens to match a value inside one.
removed_df = df.drop(index=to_remove)