
How to delete specific rows in pandas dataframe if a condition is met

I have a pandas dataframe with a few thousand rows and only one column. The structure of the content is as follows:

  |   0
0 | Score 1
1 | Date 1
2 | Group 1
3 | Score 1
4 | Score 2
5 | Date 2
6 | Group 2
7 | Score 2
8 | Score 3
9 | Date 3
10| Group 3
11| ...
12| ...
13| Score (n-1)
14| Score n
15| Date n
16| Group n

I need to delete all rows with index i if "Score" in row(i) and "Score" in row(i+1). Any suggestion on how to achieve this?

The expected output is as follows:

  |   0
0 | Score 1
1 | Date 1
2 | Group 1
3 | Score 2
4 | Date 2
5 | Group 2
6 | Score 3
7 | Date 3
8 | Group 3
9 | ...
10| ...
11| Score n
12| Date n
13| Group n

Given

>>> df
         0
0  Score 1
1   Date 1
2  Group 1
3  Score 1
4  Score 2
5   Date 2
6  Group 2
7  Score 2
8  Score 3
9   Date 3

you can use

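>>> # na=False: the last row has no successor, so its shifted value is NaN and must count as "no match"
>>> mask = df.assign(shift=df[0].shift(-1)).apply(lambda s: s.str.contains('Score', na=False)).all(axis=1)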
>>> df[~mask].reset_index(drop=True)
         0
0  Score 1
1   Date 1
2  Group 1
3  Score 2
4   Date 2
5  Group 2
6  Score 3
7   Date 3
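
The one-liner can also be unpacked into two named boolean Series, which may be easier to read and debug. This is only a sketch of the same idea, assuming the single column is labelled `0` as in the question; the names `is_score` and `next_is_score` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({0: ["Score 1", "Date 1", "Group 1", "Score 1",
                       "Score 2", "Date 2", "Group 2", "Score 2",
                       "Score 3", "Date 3"]})

# True where the row itself contains "Score"
is_score = df[0].str.contains("Score", na=False)

# True where the *next* row contains "Score"; the last row has no
# successor, so it is filled with False
next_is_score = is_score.shift(-1, fill_value=False)

# Drop row i only when both row i and row i+1 are "Score" rows
result = df[~(is_score & next_is_score)].reset_index(drop=True)
print(result)
```

Because the last row is filled with `False`, a trailing "Score" row is kept rather than dropped.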

Although if I were you I would fix the format of the data first, as the commenters already pointed out.
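
What "fixing the format" means is not spelled out here, so the following is just one possible reading: once the duplicate "Score" rows are gone, every three rows form one record, and the single column can be reshaped into a table with one column per field. The sketch assumes the cleaned rows repeat strictly as Score, Date, Group and that the row count is a multiple of three; the column names and the `cleaned` and `records` variables are hypothetical.

```python
import pandas as pd

# Hypothetical cleaned data: duplicates already removed
cleaned = pd.Series(["Score 1", "Date 1", "Group 1",
                     "Score 2", "Date 2", "Group 2"])

# Assumption: rows repeat strictly as Score, Date, Group, so each
# consecutive group of three values becomes one row of the table
records = pd.DataFrame(cleaned.to_numpy().reshape(-1, 3),
                       columns=["Score", "Date", "Group"])
print(records)
```

If the row count is not a multiple of three, `reshape` raises a `ValueError`, which at least makes a broken pattern visible early.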
