I have a dataframe with 3 columns: the first is a categorical variable holding a person's name, the second is the date, and the third is the cumulative number of occurrences of a problem. I would like to generate a new column with the occurrences per day for each person.
**Name Date Cumulative**
John 01-01-2020 0
John 02-01-2020 5
John 03-01-2020 10
John 04-01-2020 12
Peter 01-01-2020 0
Peter 02-01-2020 3
Peter 03-01-2020 7
Peter 04-01-2020 10
James 01-01-2020 0
James 02-01-2020 10
James 03-01-2020 14
James 04-01-2020 18
Kirk 01-01-2020 0
Kirk 02-01-2020 12
Kirk 03-01-2020 12
Kirk 04-01-2020 15
Rob 01-01-2020 0
Rob 02-01-2020 11
Rob 03-01-2020 18
Rob 04-01-2020 23
If I use `df['By Day'] = df.Cumulative.diff()`, the result is mostly right, but on the first row of each person it gives a negative number instead of 0, because it subtracts the previous person's last value from the current value. The output looks like this:
Name Date Cumulative By Day
John 01-01-2020 0 0
John 01-02-2020 0 0
John 03-01-2020 5 5
John 04-01-2020 10 5
John 05-01-2020 12 2
Peter 01-01-2020 0 -12
Peter 02-01-2020 0 0
Peter 03-01-2020 3 3
Peter 04-01-2020 7 4
Peter 04-01-2020 10 3
James 01-01-2020 0 -10
James 02-01-2020 0 0
James 03-01-2020 10 10
James 04-01-2020 14 4
James 04-01-2020 18 4
Kirk 01-01-2020 0 -18
Kirk 02-01-2020 0 0
Kirk 03-01-2020 12 12
Kirk 04-01-2020 15 3
Kirk 04-01-2020 19 4
Rob 01-01-2020 5 -14
Rob 02-01-2020 11 6
Rob 03-01-2020 18 7
Rob 04-01-2020 23 5
Rob 04-01-2020 27 4
I would like to compute the difference within each name, so that it starts from 0 whenever the person changes. I've thought about iterating by name, but that would run 5 times for each entry. For example, for Rob I would want 0 6 7 5 4 instead of a series starting with -14 (Rob's first value, 5, minus Kirk's last value, 19).
You should first use the `groupby` function on the `Name` column to apply the `diff` function separately to each person. Then you can use `fillna(0)` to replace the `NaN` values (which appear in the first row of each person) with 0:
df["By Day"] = df.groupby("Name").Cumulative.diff().fillna(0)
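
As a minimal sketch of the approach above, the following rebuilds the John and Peter rows from the question's first table and applies the grouped difference. It assumes the rows are already in date order within each name, since `diff` works in row order. The final `astype(int)` and the `by_day` variable are additions for illustration only: `diff` returns floats because of the `NaN` values, and the cast simply restores whole numbers.

```python
import pandas as pd

# Subset of the question's first table (John and Peter only).
df = pd.DataFrame({
    "Name": ["John"] * 4 + ["Peter"] * 4,
    "Date": ["01-01-2020", "02-01-2020", "03-01-2020", "04-01-2020"] * 2,
    "Cumulative": [0, 5, 10, 12, 0, 3, 7, 10],
})

# diff() is computed inside each Name group, so the first row of every
# person is NaN rather than a difference against the previous person.
# fillna(0) turns that NaN into 0; astype(int) is an optional extra.
df["By Day"] = df.groupby("Name")["Cumulative"].diff().fillna(0).astype(int)

by_day = df["By Day"].tolist()
print(df)
```

Note that `fillna(0)` sets each person's first day to 0 even when their cumulative count does not start at 0, which matches what the question asks for in Rob's case.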