
Custom Cumulative Sum in Pandas

Noob question, apologies.

I am trying to compute a cumulative sum on a table I have imported. However, I want it to behave slightly differently partway down the column before continuing. Is there a way to get cumsum() to calculate up to a given row and then continue from a later point?

df['Cumlative Sum'] = df['Value'].cumsum()

|    | Value | Cumlative Sum | Expected Cumlative Sum |
|----|-------|---------------|------------------------|
| 0  | 329.6 | 329.6         | 329.6                  |
| 1  | 34.0  | 363.6         | 363.6                  |
| 2  | 10    | 373.6         | 373.6                  |
| 3  | 8     | 381.6         | 381.6                  |
| 4  | 3     | 384.6         | 384.6                  |
| 5  | -2    | 382.6         | 382.6                  |
| 6  | -4    | 378.6         | 378.6                  |
| 7  | -34   | 344.6         | 344.6                  |
| 8  | -1    | 343.6         | 343.6                  |
| 9  | 343.6 | 687.2         | 343.6                  |
| 10 | 0     | 687.2         | 343.6                  |
| 11 | -33   | 654.2         | 310.6                  |
| 12 | -3    | 651.2         | 307.6                  |
| 13 | 0     | 651.2         | 307.6                  |
| 14 | 1     | 652.2         | 308.6                  |
| 15 | 4     | 656.2         | 312.6                  |
| 16 | 0     | 656.2         | 312.6                  |
| 17 | 21    | 677.2         | 333.6                  |
| 18 | 333.6 | 1010.8        | 333.6                  |

You can get started with something like this:

import pandas as pd
import numpy as np

df = pd.DataFrame(data=np.random.randint(0,100,size=(20,2)),columns=['A','B'])

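def Offset_CumSum(Column, Percentage_Offset=0.5):
    # Skip the first part of the column and run the cumulative sum
    # only from the offset row (the midpoint by default) onward
    return np.cumsum(Column[int(len(Column)*Percentage_Offset):])

# apply() passes each column to the function in turn
Cumsum_DF = df.apply(lambda x: Offset_CumSum(x), axis=0)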
print(df)
print(Cumsum_DF)

This produces output like the following (the data is random, so your numbers will differ).

     A   B
0   29  11
1    9  51
2   99  31
3   30  44
4   76  13
5   32  48
6   85  83
7    9  98
8   49  34
9   25   0
10  39  22
11  25  96
12  69   7
13  28   6
14   4  92
15  90  32
16  68  72
17  63  25
18  85  47
19  61  31
      A    B
10   39   22
11   64  118
12  133  125
13  161  131
14  165  223
15  255  255
16  323  327
17  386  352
18  471  399
19  532  430

=====================================================================

Adding code specific to the question's dataset after seeing the edit.

import pandas as pd
import numpy as np

df = pd.DataFrame(data=np.random.randint(0,100,size=(20,2)),columns=['A','B'])

def Offset_CumSum(Column, Percentage_Offset=0.5):
    # Row position at which the running total starts over
    Split = int(len(Column)*Percentage_Offset)
    # Sum each part independently, then join the two lists back together
    return np.cumsum(Column[:Split]).tolist() + np.cumsum(Column[Split:]).tolist()

Cumsum_DF = df.apply(lambda x: Offset_CumSum(x), axis=0)
print(df)
print(Cumsum_DF)
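
The percentage offset above only splits the column once, while the expected column in the question appears to start over at rows 9 and 18. As a rough sketch of the same idea with explicit break positions, something like the following might work. The names `segmented_cumsum` and `breaks` are made up for this illustration, and the break rows are an assumption read off the question's table.

```python
import numpy as np
import pandas as pd

# 'Value' column copied from the table in the question
values = pd.Series([329.6, 34.0, 10, 8, 3, -2, -4, -34, -1, 343.6,
                    0, -33, -3, 0, 1, 4, 0, 21, 333.6], name="Value")


def segmented_cumsum(column, breaks):
    """Cumulative sum that starts over at each row position in `breaks`.

    Hypothetical helper for this sketch, not part of pandas or NumPy.
    """
    # np.split cuts the array just before each position listed in `breaks`
    pieces = np.split(column.to_numpy(dtype=float), breaks)
    # Run an independent cumsum on every piece, then glue them back together
    joined = np.concatenate([np.cumsum(piece) for piece in pieces])
    return pd.Series(joined, index=column.index, name=column.name)


# Assumed restart rows: 9 and 18
result = segmented_cumsum(values, breaks=[9, 18])
print(result.round(1).tolist())
```

Passing the break rows directly avoids having to express them as a fraction of the column length.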

This should work.

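# Label each block of rows so the running total restarts at rows 9 and 18
df['Group Flag'] = 0
df.loc[9:17, 'Group Flag'] = 1
df.loc[18:, 'Group Flag'] = 2
# cumsum() runs separately within each group
df['Cumlative Sum'] = df.groupby('Group Flag')['Value'].cumsum()
# drop() returns a new DataFrame, so assign the result back
df = df.drop('Group Flag', axis=1)
# 'Title' is not shown in the question's table; omit it if your data has no such column
df[['Title','Value','Cumlative Sum']]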
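
If you would rather not write the group labels by hand, one possible variation is to mark the restart rows and let a cumulative sum over those markers number the groups. This is only a sketch: `restart_rows` and `segment` are illustrative names, and rows 9 and 18 are assumed from the expected column in the question.

```python
import pandas as pd

# 'Value' column copied from the table in the question
df = pd.DataFrame({"Value": [329.6, 34.0, 10, 8, 3, -2, -4, -34, -1, 343.6,
                             0, -33, -3, 0, 1, 4, 0, 21, 333.6]})

# Rows where the running total should start over (assumed from the table)
restart_rows = [9, 18]

# True at each restart row; cumsum() over the markers yields labels 0, 1, 2
segment = pd.Series(df.index.isin(restart_rows), index=df.index).cumsum()

# Group by the labels and accumulate 'Value' within each group
df["Cumlative Sum"] = df.groupby(segment)["Value"].cumsum()
print(df)
```

Because the labels live in a separate Series, no helper column has to be added to `df` and dropped afterwards.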
