
How to enumerate rows in pandas with nonunique values in groups

I am working with expedition geodata. Could you help me number the stations, and the records within each station, based on expedition ID (ID), date (Date), latitude (Lat) and longitude (Lon)? There is also a value column (Val), which is not relevant to the numbering. A station is a group of rows with the same (ID, Date, Lat, Lon), and an expedition is a group of rows with the same ID. The dataframe is sorted by these four columns, as in the example.

Dataset and required columns

import pandas as pd
data = [[1, '2017/10/10', 70.1, 30.4, 10],
        [1, '2017/10/10', 70.1, 31.4, 20],
        [1, '2017/10/10', 70.1, 31.4, 10],
        [1, '2017/10/10', 70.1, 31.4, 10],
        [1, '2017/10/12', 70.1, 31.4, 20],
        [2, '2017/12/10', 70.1, 30.4, 20],
        [2, '2017/12/10', 70.1, 31.4, 20]]

df = pd.DataFrame(data, columns=['ID', 'Date', 'Lat', 'Lon', 'Val'])

The additional columns I need (St is the station number within an expedition, Rec is the record number within a station; shown here as the expected output for the example above):

df['St'] = [1, 2, 2, 2, 3, 1, 2]
df['Rec'] = [1, 1, 2, 3, 1, 1, 1]
print(df)

I have tried groupby, cumcount, agg and factorize, but have not been able to solve the problem.

Any help would be appreciated. Thanks!

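To create 'St', group by 'ID' and, within each group, use shift to flag the rows where any of the columns 'Date', 'Lat', 'Lon' differs from the previous row. The first row of each group is compared against missing values, so it is always flagged. Taking the cumsum of those flags then gives the station numbers you want: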

df['St'] = (df.groupby(['ID'])
              .apply(lambda x: (x[['Date','Lat','Lon']].shift() != x[['Date','Lat','Lon']])
                               .any(axis=1).cumsum())).values
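
As a variation on the same shift/cumsum idea, here is a minimal sketch that avoids groupby.apply and .values, so the result stays aligned with the index of df. The names keys, prev and is_new_station are introduced here for illustration only, and the sketch assumes, as the question states, that the rows are already sorted so that each station's rows are contiguous.

```python
import pandas as pd

# Sample data from the question
data = [[1, '2017/10/10', 70.1, 30.4, 10],
        [1, '2017/10/10', 70.1, 31.4, 20],
        [1, '2017/10/10', 70.1, 31.4, 10],
        [1, '2017/10/10', 70.1, 31.4, 10],
        [1, '2017/10/12', 70.1, 31.4, 20],
        [2, '2017/12/10', 70.1, 30.4, 20],
        [2, '2017/12/10', 70.1, 31.4, 20]]
df = pd.DataFrame(data, columns=['ID', 'Date', 'Lat', 'Lon', 'Val'])

# Columns that identify a station within one expedition (illustrative name)
keys = ['Date', 'Lat', 'Lon']

# Previous row's key values within each expedition;
# the first row of every expedition gets missing values
prev = df.groupby('ID')[keys].shift()

# True where at least one key column differs from the previous row,
# i.e. where a new station starts (always True on an expedition's first row)
is_new_station = df[keys].ne(prev).any(axis=1)

# Running count of station starts, restarted for every expedition
df['St'] = is_new_station.astype(int).groupby(df['ID']).cumsum()

print(df['St'].tolist())  # [1, 2, 2, 2, 3, 1, 2]
```

Because every step returns an object indexed like df, the assignment does not depend on the order in which groupby emits the groups.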

To create 'Rec', you also need groupby, but on all four columns 'ID', 'Date', 'Lat', 'Lon'. cumcount numbers the rows within each group starting from 0, so add 1 to start from 1:

df['Rec'] = df.groupby(['ID','Date','Lat','Lon']).cumcount().add(1)

and you get:

   ID        Date   Lat   Lon  Val  St  Rec
0   1  2017/10/10  70.1  30.4   10   1    1
1   1  2017/10/10  70.1  31.4   20   2    1
2   1  2017/10/10  70.1  31.4   10   2    2
3   1  2017/10/10  70.1  31.4   10   2    3
4   1  2017/10/12  70.1  31.4   20   3    1
5   2  2017/12/10  70.1  30.4   20   1    1
6   2  2017/12/10  70.1  31.4   20   2    1
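
To see what the cumcount step does on its own, the following sketch rebuilds the question's data and shows the zero-based counter before 1 is added. The intermediate name rec0 is only for illustration.

```python
import pandas as pd

# Sample data from the question
data = [[1, '2017/10/10', 70.1, 30.4, 10],
        [1, '2017/10/10', 70.1, 31.4, 20],
        [1, '2017/10/10', 70.1, 31.4, 10],
        [1, '2017/10/10', 70.1, 31.4, 10],
        [1, '2017/10/12', 70.1, 31.4, 20],
        [2, '2017/12/10', 70.1, 30.4, 20],
        [2, '2017/12/10', 70.1, 31.4, 20]]
df = pd.DataFrame(data, columns=['ID', 'Date', 'Lat', 'Lon', 'Val'])

# cumcount numbers the rows of each (ID, Date, Lat, Lon) group
# 0, 1, 2, ... in their original row order
rec0 = df.groupby(['ID', 'Date', 'Lat', 'Lon']).cumcount()
print(rec0.tolist())       # [0, 0, 1, 2, 0, 0, 0]

# Shift to a 1-based record number
df['Rec'] = rec0 + 1
print(df['Rec'].tolist())  # [1, 1, 2, 3, 1, 1, 1]
```

Note that this groups on the key values rather than on row adjacency. That matches the shift-based 'St' numbering only while each station's rows are contiguous, which holds for the sorted dataframe described in the question.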


 