Pandas groupby cumulative sum and month

Question

I need to aggregate my df by month and get the cumulative count of log entries for each user (there are about 5 or 6 users). Each row is a separate log entry, so a user may have multiple entries on the same day.

        USER        DATE
1      user1  2021-04-15
4      user5  2021-04-15
5      user3  2021-04-15
6      user1  2021-04-15
14     user2  2021-04-16
...      ...         ...
2227   user4  2021-12-30
2228   user5  2021-12-30
2229   user3  2021-12-30
2230   user2  2021-12-30
2231   user1  2021-12-30

I would like to get something like this:

         MONTH    USER  CUMSUM
1      2021-04   user1       3
2      2021-04   user2       5
3      2021-04   user3       2
4      2021-04   user4       0
5      2021-04   user5       1
...        ...     ...     ...
n      2021-12   user1     232
n+1    2021-12   user2     124
n+2    2021-12   user3     152
n+3    2021-12   user4     312
n+4    2021-12   user5     218

The goal is to later plot the cumulative sum by month for each user. I have code that already works, but it iterates over the df and counts each user's entries per month in a dict, which is probably not the most efficient way. I tried using cumsum and groupby, but so far without success.

Answer 1

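You can group by month with pd.Grouper. With only freq given, it bins on the index, so DATE has to be set as a datetime index first (alternatively, pass key='DATE' and skip set_index). The frame has no numeric column to sum, so size() is used here to count the entries per user per month; chain .groupby(level='USER').cumsum() to turn those monthly counts into a running total.

df['DATE'] = pd.to_datetime(df['DATE'])  # Grouper needs real datetimes, not strings
df.set_index('DATE').groupby([pd.Grouper(freq='M'), 'USER']).size()  # entries per user per month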
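
As a minimal sketch of the full chain (monthly count, then running total per user), the following uses a small invented sample frame, not the asker's data. It passes key="DATE" to pd.Grouper instead of setting the index, and uses the "MS" (month start) alias; recent pandas releases deprecate the "M" alias in favour of "ME", so the choice of alias is an assumption about the installed version.

```python
import pandas as pd

# Hypothetical sample log: one row per entry (values invented for illustration).
df = pd.DataFrame({
    "USER": ["user1", "user2", "user1", "user1", "user2"],
    "DATE": ["2021-04-15", "2021-04-16", "2021-04-20", "2021-05-03", "2021-06-10"],
})
df["DATE"] = pd.to_datetime(df["DATE"])

# key="DATE" lets Grouper bin a column directly, so no set_index call is needed.
# "MS" labels each bin with the first day of the month.
monthly = df.groupby([pd.Grouper(key="DATE", freq="MS"), "USER"]).size()

# Running total of entries per user across months.
cumulative = monthly.groupby(level="USER").cumsum().rename("CUMSUM")
print(cumulative)
```

Only (month, user) pairs that actually occur in the data appear in the result, so a user with no entries in a given month has no row for that month.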

Answer 2

import pandas as pd

# parse DATE and keep only the year-month part as the label
df['MONTH'] = pd.to_datetime(df['DATE'], format='%Y-%m-%d').dt.strftime('%Y-%m')
df['count'] = 1  # one per log entry, so summing it counts entries
# monthly count per user, then running total within each user
df_try = df.groupby(['USER', 'MONTH'])['count'].sum().groupby(level=0).cumsum()
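
The desired output in the question lists a row for every user in every month, including a 0 for a user with no entries yet. A groupby alone returns only the combinations that occur, so one possible way to get the full grid is to reindex the monthly counts against every (month, user) pair before taking the cumulative sum. This is a sketch on invented sample data; the names all_months, users and full_index are illustrative only.

```python
import pandas as pd

# Hypothetical sample data; user2 has no entries in 2021-05, user1 none in 2021-06.
df = pd.DataFrame({
    "USER": ["user1", "user2", "user1", "user1", "user2"],
    "DATE": ["2021-04-15", "2021-04-16", "2021-04-20", "2021-05-03", "2021-06-10"],
})

# Monthly period labels such as 2021-04.
df["MONTH"] = pd.to_datetime(df["DATE"]).dt.to_period("M")

# Entries per (month, user); only observed combinations are present.
monthly = df.groupby(["MONTH", "USER"]).size()

# Build every (month, user) pair and fill the missing ones with 0.
all_months = pd.period_range(df["MONTH"].min(), df["MONTH"].max(), freq="M")
users = sorted(df["USER"].unique())
full_index = pd.MultiIndex.from_product([all_months, users], names=["MONTH", "USER"])
monthly = monthly.reindex(full_index, fill_value=0)

# Running total per user, returned as a flat MONTH / USER / CUMSUM table.
result = monthly.groupby(level="USER").cumsum().rename("CUMSUM").reset_index()
print(result)
```

A user who never appears in the data at all would have to be added to users by hand, since the list is derived from the USER column.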

Answer 3

How about this one-liner (DATE must already be a datetime column):

df.groupby([pd.Grouper(key='DATE', freq='M'), 'USER'])['USER'].count().groupby(['USER']).cumsum()
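
Since the stated goal is a plot per user, a result shaped like the one-liner's can be unstacked into one column per user. The sketch below runs on invented sample data and uses size() rather than count(), which gives the same numbers when USER has no missing values; the ffill step is an assumption about how gaps should be drawn (a user's total stays flat in months where they logged nothing).

```python
import pandas as pd

# Hypothetical sample data (values invented for illustration).
df = pd.DataFrame({
    "USER": ["user1", "user2", "user1", "user1", "user2"],
    "DATE": pd.to_datetime(
        ["2021-04-15", "2021-04-16", "2021-04-20", "2021-05-03", "2021-06-10"]
    ),
})

# Cumulative entries per user by month, indexed by (DATE, USER).
cum = (
    df.groupby([pd.Grouper(key="DATE", freq="MS"), "USER"])
    .size()
    .groupby(level="USER")
    .cumsum()
)

# One column per user. Carry each running total forward through months the
# user skipped, and use 0 before a user's first entry.
wide = cum.unstack("USER").ffill().fillna(0).astype(int)
print(wide)

# wide.plot() would then draw one line per user (this needs matplotlib).
```

Months in which no user logged anything do not appear as rows in wide, because the groupby only returns observed groups.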

3 answers

solution1
0 2022-01-07 17:26:24

solution2
0 2022-01-07 17:40:36

solution3
0 ACCEPTED 2022-01-07 17:44:54
