Imagine I have a dataframe like the one below, where I am recording each animal I see on each day as a new row.
Day Animal
1 Lion
1 Elephant
1 Giraffe
1 Elephant
2 Elephant
2 Rhino
2 Rhino
2 Lion
2 Elephant
I would like to create a new column that contains 1 for the first animal seen on each day (and each time that same animal is seen that day), 2 for the next animal, and so on. The result for the example above should look like this:
Day Animal Number
1 Lion 1
1 Elephant 2
1 Giraffe 3
1 Elephant 2
2 Elephant 1
2 Rhino 2
2 Rhino 2
2 Lion 3
2 Elephant 1
Note that this is a simplified example. I am aware that in the example above one would likely prefer a combination of groupby and count to count occurrences per day (e.g. "summing the number of occurrences per day pandas"). However, that is not what I need in my real-world case: I need to number the animals so I can use those numbers for something else later.
You can use Series.factorize inside groupby.transform:
df['Number'] = df.groupby("Day")['Animal'].transform(lambda x: x.factorize()[0])+1
print(df)
Day Animal Number
0 1 Lion 1
1 1 Elephant 2
2 1 Giraffe 3
3 1 Elephant 2
4 2 Elephant 1
5 2 Rhino 2
6 2 Rhino 2
7 2 Lion 3
8 2 Elephant 1
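For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame construction is my own assumption, rebuilt from the table in the question. The standalone factorize call is only there to show what it returns: integer codes starting at 0 in order of first appearance, plus the unique values, which is why the answer adds 1.

```python
import pandas as pd

# Example data rebuilt from the question (assumed construction)
df = pd.DataFrame({
    "Day": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "Animal": ["Lion", "Elephant", "Giraffe", "Elephant",
               "Elephant", "Rhino", "Rhino", "Lion", "Elephant"],
})

# factorize on a single Series: codes are 0-based, ordered by first appearance
codes, uniques = pd.Series(["Lion", "Elephant", "Giraffe", "Elephant"]).factorize()

# Apply factorize within each day, then shift the codes to start at 1
df["Number"] = (
    df.groupby("Day")["Animal"].transform(lambda s: s.factorize()[0]) + 1
)
print(df)
```

Because the codes are computed per group, numbering restarts on each day, so Elephant is 2 on day 1 but 1 on day 2.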