简体   繁体   中英

Pandas groupby cumulative sum and month

I need to sort my df by month with the cumulative sum for each user (about 5 or 6). Each line is a different log entry by the user, so users may have multiple entries on the same day.

        USER        DATE
1      user1  2021-04-15
4      user5  2021-04-15
5      user3  2021-04-15
6      user1  2021-04-15
14     user2  2021-04-16
...      ...         ...
2227   user4  2021-12-30
2228   user5  2021-12-30
2229   user3  2021-12-30
2230   user2  2021-12-30
2231   user1  2021-12-30

I would like to get something like this

         MONTH    USER  CUMSUM
1      2021-04   user1       3
2      2021-04   user2       5
3      2021-04   user3       2
4      2021-04   user4       0
5      2021-04   user5       1
...        ...     ...     ...
n      2021-12   user1     232
n+1    2021-12   user2     124
n+2    2021-12   user3     152
n+3    2021-12   user4     312
n+4    2021-12   user5     218

The objective is to later graph the cumulative sum by month for each user. I have a code that is already working but had to iterate on the df and count each entry for each month on a dict. Probably not the most efficient way. I tried using cumsum and groupby but so far without success.

You can use pandas Grouper or more typically written pd.Grouper for the month, but you have to set an index if you don't already have one.

df.set_index('DATE').groupby([pd.Grouper(freq = 'M'),'USER']).sum()
df['MONTH'] = pd.to_datetime(df['DATE'], format='%Y-%m-%d') # getting DATE to datetime
df['MONTH'] = df['MONTH'].apply(lambda x: x.strftime("%Y-%m")) # applying your format
df['count'] = 1 # adding a count column for cumsum()
df_try = df.groupby(['USER', 'MONTH']).sum().groupby(level=0).cumsum() # groupby and cumsum

how about this one-liner:

df.groupby([pd.Grouper(key='DATE', freq='M'), 'USER'])['USER'].count().groupby(['USER']).cumsum()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM