
Pandas Grouper "Cumulative" sum()

I'm trying to calculate the cumulative total for the next 4 weeks.

Here is an example of my data frame:

import pandas as pd

d = {'account': [10, 10, 10, 10, 10, 10, 10, 10],
     'volume': [25, 60, 40, 100, 50, 100, 40, 50]}
df = pd.DataFrame(d)
df['week_starting'] = pd.date_range('05/02/2021',
                                    periods=8,
                                    freq='W')
# Expected results, written out by hand
df['volume_next_4_weeks'] = [225, 250, 290, 290, 240, 190, 90, 50]
df['volume_next_4_weeks_cumulative'] = ['(25+60+40+100)', '(60+40+100+50)', '(40+100+50+100)', '(100+50+100+40)', '(50+100+40+50)', '(100+40+50)', '(40+50)', '(50)']
df.head(10)


I would like to find a way to calculate the cumulative amount using pd.Grouper with freq='4W'.

This should work:

df['volume_next_4_weeks'] = [sum(df['volume'][i:i+4]) for i in range(len(df))]
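
For context on why a list comprehension over slices is used here rather than pd.Grouper: a Grouper with freq='4W' splits the rows into non-overlapping bins, so each row contributes to exactly one total, whereas the desired column needs one overlapping window per row. A small sketch of the difference, assuming the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    "week_starting": pd.date_range("2021-05-02", periods=8, freq="W"),
    "volume": [25, 60, 40, 100, 50, 100, 40, 50],
})

# Non-overlapping 4-week bins: each row is counted in exactly one bin,
# so there are fewer results than rows.
binned = df.groupby(pd.Grouper(key="week_starting", freq="4W"))["volume"].sum()

# Overlapping forward windows: each row starts its own window of up to 4 rows,
# so there is one result per row.
forward = [int(df["volume"].iloc[i:i + 4].sum()) for i in range(len(df))]

print(binned)
print(forward)
```

The binned result has one entry per 4-week bin, while the forward-window list has one entry per row, which is the shape the question asks for.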

For the other column, which shows the addition as a string, I stored the values in a list using the same logic as above but without applying sum, and then joined the list elements into a string:

df['volume_next_4_weeks_cumulative'] = [df['volume'][i:i+4].to_list() for i in range(len(df))]
df['volume_next_4_weeks_cumulative'] = df['volume_next_4_weeks_cumulative'].apply(lambda row: ' + '.join(str(x) for x in row))
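
Note that the join above yields strings such as '25 + 60 + 40 + 100', while the question's expected column uses parentheses and no spaces. If that exact format matters, a minimal sketch could look like this (the `window` variable is introduced here only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"volume": [25, 60, 40, 100, 50, 100, 40, 50]})

window = 4  # number of rows in each forward-looking window (illustrative name)

# Build "(a+b+c+d)" for each row from that row and up to three following rows.
df["volume_next_4_weeks_cumulative"] = [
    "(" + "+".join(str(v) for v in df["volume"].iloc[i:i + window]) + ")"
    for i in range(len(df))
]

print(df["volume_next_4_weeks_cumulative"].tolist())
```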

Now, as you mentioned, you have multiple accounts and want to do this separately for each of them. Create a custom function, then use groupby and apply to create the columns:

def create_mov_cols(df):
    # Sum of the current row and up to three following rows
    df['volume_next_4_weeks'] = [sum(df['volume'][i:i+4]) for i in range(len(df))]
    # The same window, rendered as a ' + '-joined string
    df['volume_next_4_weeks_cumulative'] = [df['volume'][i:i+4].to_list() for i in range(len(df))]
    df['volume_next_4_weeks_cumulative'] = df['volume_next_4_weeks_cumulative'].apply(lambda row: ' + '.join(str(x) for x in row))
    return df

Apply the function to the DataFrame:

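df = df.groupby('account', group_keys=False).apply(create_mov_cols)

An alternative for a single account is a time-based rolling window over the reversed frame:

df['volume_next_4_weeks'] = df[['week_starting', 'volume']][::-1].rolling(window='28D', on='week_starting').sum()[::-1]['volume']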

28D is used instead of 4W because the latter is not a fixed frequency.
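
A forward-looking window can also be written without reversing the frame. The sketch below assumes pandas 1.1 or later, where `pandas.api.indexers.FixedForwardWindowIndexer` is available, and assumes the rows are already sorted by week within each account with exactly one row per week. Account 20 and its volumes are made up for illustration and are not part of the question's data.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

df = pd.DataFrame({
    # Account 10 is the question's data; account 20 is hypothetical.
    "account": [10] * 8 + [20] * 3,
    "volume": [25, 60, 40, 100, 50, 100, 40, 50, 1, 2, 3],
})

# Window covering the current row and the next three rows.
indexer = FixedForwardWindowIndexer(window_size=4)

# min_periods=1 keeps the shorter windows at the end of each account
# instead of turning them into NaN.
df["volume_next_4_weeks"] = (
    df.groupby("account")["volume"]
      .transform(lambda s: s.rolling(indexer, min_periods=1).sum())
)

print(df)
```

Unlike the '28D' window, this version counts rows rather than days, so it only matches the time-based result when no weeks are missing.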
