Pandas: Keep only first occurrence of value in group of consecutive values

I have a dataframe that looks like the following (actually, this is the abstracted result of a calculation):

import pandas as pd


data = {"A":[i for i in range(10)]}
index = [1, 3, 4, 5, 9, 10, 12, 13, 15, 20]
df = pd.DataFrame(index=index, data=data)
print(df)

yields:

    A
1   0
3   1
4   2
5   3
9   4
10  5
12  6
13  7
15  8
20  9

Now I want to filter the rows so that only the first value in each group of consecutive index values is kept, e.g. the following result:

    A
1   0
3   1
9   4
12  6
15  8
20  9

Any hints on how to achieve this efficiently?

Use Series.diff. It is not implemented for Index (at least in older pandas versions), so convert the index to a Series and compare for not equal to 1:

df = df[df.index.to_series().diff().ne(1)]
print(df)
    A
1   0
3   1
9   4
12  6
15  8
20  9
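
To see why this works, it can help to look at the intermediate values. The sketch below rebuilds the example frame and wraps the filter in a helper; the name first_of_consecutive is made up for illustration, and an integer index sorted in increasing order is assumed.

```python
import pandas as pd


def first_of_consecutive(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep rows whose index label is not exactly one more than the previous label."""
    steps = frame.index.to_series().diff()  # first entry is NaN (no predecessor)
    return frame[steps.ne(1)]               # NaN != 1 is True, so the first row is kept


index = [1, 3, 4, 5, 9, 10, 12, 13, 15, 20]
df = pd.DataFrame(index=index, data={"A": list(range(10))})

# Step from each label to the previous one: nan, 2, 1, 1, 4, 1, 2, 1, 2, 5
steps = df.index.to_series().diff()
print(steps.tolist())

result = first_of_consecutive(df)
print(result.index.tolist())
```

Unlike the one-liner above, the helper returns a new frame and leaves df unchanged, which may be preferable if the full data is still needed afterwards.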

Try this one:

import numpy as np

df.iloc[np.unique(np.array(index)-np.arange(len(index)), return_index=True)[1]]
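
The idea here seems to be that within a run of consecutive labels, the label minus its position stays constant, so that difference can serve as a key for the run. np.unique with return_index=True then gives the position of the first occurrence of each key. A minimal sketch with the intermediate values spelled out, assuming a strictly increasing integer index (otherwise two separate runs could end up with the same key); the names labels, keys and first_pos are illustrative:

```python
import numpy as np
import pandas as pd

index = [1, 3, 4, 5, 9, 10, 12, 13, 15, 20]
df = pd.DataFrame(index=index, data={"A": list(range(10))})

labels = df.index.to_numpy()
# Label minus position is constant inside a run of consecutive labels.
keys = labels - np.arange(len(labels))
print(keys.tolist())  # [1, 2, 2, 2, 5, 5, 6, 6, 7, 11]

# Position of the first occurrence of each distinct key.
_, first_pos = np.unique(keys, return_index=True)
result = df.iloc[first_pos]
print(result.index.tolist())
```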

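Try grouping on a run identifier built from the index and keeping the first row of each group (grouping by 'A' itself would put every row in its own group, since the values in A are unique): df.groupby(df.index.to_series().diff().ne(1).cumsum()).head(1)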
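
A groupby-based approach needs a key that is shared by all rows in one run of consecutive labels. The following is a minimal sketch of one way to build such a key, assuming a sorted integer index; run_id is an illustrative name, not something from the answers above.

```python
import pandas as pd

index = [1, 3, 4, 5, 9, 10, 12, 13, 15, 20]
df = pd.DataFrame(index=index, data={"A": list(range(10))})

# A new run starts wherever the step from the previous label is not 1;
# the cumulative sum of those starts numbers the runs 1, 2, 3, ...
run_id = df.index.to_series().diff().ne(1).cumsum()
print(run_id.tolist())  # [1, 2, 2, 2, 3, 3, 4, 4, 5, 6]

# head(1) keeps the first row of each run and preserves the original index.
result = df.groupby(run_id).head(1)
print(result.index.tolist())
```

The same run_id can be reused for per-run aggregates, for example df.groupby(run_id).size() to get the length of each run.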
