简体   繁体   中英

Python Pandas Sequentially Count Up Occurrences For Unique Pairs in Multiindex

I have a dataframe logging exercises completed, with a two column multiindex: Day and Person. Each day, each person logs which exercises they did (if they exercised). I would like to add another column which sequentially counts the entries made into this log, as shown below. So for each unique pair of day and person, count up by 1.

Day   Person   Exercise    EntryNumber
1     Joe      Curls                 1
1     Joe      Squats                1
1     Sandy    Sprints               2
1     Sandy    Bench                 2
2     Joe      Curls                 3
2     Sandy    Squats                4
3     Bob      Pushups               5

Here is the code to generate that above dataframe.

import pandas as pd
df = pd.DataFrame({'Day':[1,1,1,1,2,2,3], 
                   'Person':['Joe','Joe','Sandy','Sandy','Joe','Sandy','Bob'], 
                   'Exercise':['Curls','Squats','Sprints','Bench','Curls','Squats','Pushups']})
df = df.set_index(['Day','Person'])

How would I go about creating the EntryNumber column? I've tried all manner of groupby and cumcount but have not yet figured it out.

Thanks!

May be you can try with groupby followed by ngroup() :

#Generating df from above
import pandas as pd
df = pd.DataFrame({'Day':[1,1,1,1,2,2,3], 
                   'Person':['Joe','Joe','Sandy','Sandy','Joe','Sandy','Bob'], 
                   'Exercise':['Curls','Squats','Sprints','Bench','Curls','Squats','Pushups']})
df = df.set_index(['Day','Person'])

# applying reset index and ngroup
df.reset_index(inplace=True)
df['Entry Number'] = df.groupby(['Day','Person']).ngroup() +1
df

Result:

    Day Person  Exercise    Entry Number
0   1   Joe      Curls      1
1   1   Joe      Squats     1
2   1   Sandy    Sprints    2
3   1   Sandy    Bench      2
4   2   Joe      Curls      3
5   2   Sandy    Squats     4
6   3   Bob      Pushups    5

Another way is factorize by index without having to group:

df['EntryNumber'] = df.index.factorize()[0]+1
#df = df.reset_index() -> if you want to reset theindex

print(df)
           Exercise  EntryNumber
Day Person                      
1   Joe       Curls            1
    Joe      Squats            1
    Sandy   Sprints            2
    Sandy     Bench            2
2   Joe       Curls            3
    Sandy    Squats            4
3   Bob     Pushups            5

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM