
Number new occurrences per day in pandas dataframe (not count or sum)

Imagine I have a dataframe like the one below, where I am recording each animal I see on each day as a new row.

Day     Animal
1       Lion
1       Elephant
1       Giraffe
1       Elephant
2       Elephant
2       Rhino
2       Rhino
2       Lion
2       Elephant

I would like to create a new column that contains 1 for the first animal seen on each day (and for every later sighting of that same animal on that day), 2 for the next new animal seen that day, and so on. The result for the example above should look like this:

Day     Animal      Number
1       Lion        1
1       Elephant    2
1       Giraffe     3
1       Elephant    2 
2       Elephant    1
2       Rhino       2
2       Rhino       2
2       Lion        3
2       Elephant    1

Note that this is a simplified example. I am aware that in the example above one would likely prefer a combination of groupby and count to count occurrences per day (e.g. "summing the number of occurrences per day pandas"). However, that is not what I need in my real-world case: I need to number the animals so I can use those numbers for something else later.

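You can use Series.factorize inside groupby.transform. Within each day, factorize assigns integer codes starting at 0 in order of first appearance, so adding 1 gives the numbering you want: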

df['Number'] = df.groupby("Day")['Animal'].transform(lambda x: x.factorize()[0])+1
print(df)

   Day    Animal  Number
0    1      Lion       1
1    1  Elephant       2
2    1   Giraffe       3
3    1  Elephant       2
4    2  Elephant       1
5    2     Rhino       2
6    2     Rhino       2
7    2      Lion       3
8    2  Elephant       1
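
For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the example frame from the question and applies the same transform. The second half is an alternative I am adding for illustration, not part of the answer above: it numbers the first sighting of each animal per day with cumcount and merges the result back. The names first_seen and Number_alt are made up for this sketch.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "Day": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "Animal": ["Lion", "Elephant", "Giraffe", "Elephant",
               "Elephant", "Rhino", "Rhino", "Lion", "Elephant"],
})

# factorize() returns (codes, uniques); the codes start at 0 and follow
# the order of first appearance within each Day group, so add 1.
df["Number"] = (
    df.groupby("Day")["Animal"].transform(lambda s: s.factorize()[0]) + 1
)

# Alternative sketch (an assumption, not from the answer above):
# keep one row per (Day, Animal) in order of first appearance,
# number those rows within each day, then merge the numbers back.
first_seen = df[["Day", "Animal"]].drop_duplicates()
first_seen = first_seen.assign(
    Number_alt=first_seen.groupby("Day").cumcount() + 1
)
df = df.merge(first_seen, on=["Day", "Animal"], how="left")

print(df)
```

Both columns should agree on the example data. The merge variant avoids calling a Python lambda once per group, which may matter if there are many days, but I have not benchmarked it.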


 