I'm struggling to compute the cumulative sum of the income column by day in this example df. I would like the cumulative value to add up separately for each day of the week, so Monday + Monday, Tuesday + Tuesday, etc.
Example df:
import pandas as pd

df = pd.DataFrame({
    'day': ['Mon', 'Tue', 'Wed', 'Mon', 'Tue', 'Wed'],
    'date': ['2002-01-02', '2002-01-03', '2002-01-04', '2002-01-08', '2002-01-09', '2002-01-10'],
    'income': [40, 60, 40, 100, 55, 32],
})
I would like to add a cumsum() column to this df, that only adds the income for each day.
My first attempt just adds every row cumulatively, whether in a table or a plot:
df['cum_income_by_day'] = df['income'].cumsum()
Monday = df[df['day'] == 'Mon']
sn.lineplot(data=Monday, x="date", y="cum_income_by_day")
Attempt 2 throws a SettingWithCopyWarning, which is valid, and my results aren't accurate. I'm not sure what is happening, but I can see that the cumulative total of the first few values is wrong.
Monday = df[df['day'] == 'Mon']
df['cum_income_by_day'] = Monday['income'].cumsum()
I thought maybe the answer is in groupby, given that I want to do this for every day, not just Mondays, but I just get one cumulative value. I also tried a for loop (I'm new and still learning) and couldn't crack it. Any advice is much appreciated.
Ideal output looks like this:
If you like my question, please upvote it so I have enough points to upvote your answers.
You have to groupby first!
>>> df['cumsum_income'] = df.groupby('day')['income'].cumsum()
>>> df
   day        date  income  cumsum_income
0  Mon  2002-01-02      40             40
1  Tue  2002-01-03      60             60
2  Wed  2002-01-04      40             40
3  Mon  2002-01-08     100            140
4  Tue  2002-01-09      55            115
5  Wed  2002-01-10      32             72
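For completeness, here is a minimal, self-contained sketch of the same idea using the example df from the question. The column name cum_income_by_day is taken from the question's attempts. The date parsing, the sort and the lowercase monday variable are my own assumptions about what you would want before plotting, so adjust them to your real data. The running total follows row order within each group, so sorting by date first is probably what you want.

```python
import pandas as pd

df = pd.DataFrame({
    'day': ['Mon', 'Tue', 'Wed', 'Mon', 'Tue', 'Wed'],
    'date': ['2002-01-02', '2002-01-03', '2002-01-04',
             '2002-01-08', '2002-01-09', '2002-01-10'],
    'income': [40, 60, 40, 100, 55, 32],
})

# Assumption: parse the dates and sort chronologically so that the
# running total within each day group follows calendar order.
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values('date').reset_index(drop=True)

# Running total of income computed separately within each 'day' group.
# The result keeps the original index, so it assigns back row by row.
df['cum_income_by_day'] = df.groupby('day')['income'].cumsum()

# Select one day (e.g. for a line plot). Taking an explicit copy avoids
# SettingWithCopyWarning if the subset is modified later.
monday = df[df['day'] == 'Mon'].copy()
print(monday[['date', 'income', 'cum_income_by_day']])
```

Because the cumulative column is computed on the full df before filtering, every day gets its own running total in one step, and the filtered frame can be passed straight to a plotting call.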