
Python: Cumulative Sum with changing key

I have a table of data such as:

F(1)  F(2)  F(3)  Amount
A     B     C     100
A     B     C     100
A     B     C     100
D     E     F     300
D     E     F     150
G     H     I     100
G     H     I     200

I would like to produce a new column showing the cumulative sum of the 'Amount' field, but one that resets to 0 whenever the key formed by columns F(1), F(2) and F(3) changes.

i.e. I would like to create the following output (without the dashed lines!):

F(1)  F(2)  F(3)  Amount  CumSum
A     B     C     100     100
A     B     C     100     200
A     B     C     100     300
------------------------------ resets to zero as key changes
D     E     F     300     300
D     E     F     150     450
------------------------------ resets to zero as key changes
G     H     I     100     100
G     H     I     200     300

This table could have up to a million rows, so I am looking for a robust implementation. Is pandas the way forward here? I have not used pandas before but am happy to explore.

Group by your key columns and call cumsum:

df['CumSum'] = df.groupby(['F(1)', 'F(2)', 'F(3)'])['Amount'].cumsum()

df
Out: 
  F(1) F(2) F(3)  Amount  CumSum
0    A    B    C     100     100
1    A    B    C     100     200
2    A    B    C     100     300
3    D    E    F     300     300
4    D    E    F     150     450
5    G    H    I     100     100
6    G    H    I     200     300
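
For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample rows in the question. One thing worth keeping in mind: groupby keeps one running total per distinct key, so if the same key could reappear later in the table after a different key, the total would continue rather than restart. The second column below (CumSumByRun, built from a helper named run_id, both names made up for this illustration) shows one possible way to restart on every key change instead. Whether that behaviour is needed is an assumption; the question's sample data does not exercise it.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "F(1)": ["A", "A", "A", "D", "D", "G", "G"],
    "F(2)": ["B", "B", "B", "E", "E", "H", "H"],
    "F(3)": ["C", "C", "C", "F", "F", "I", "I"],
    "Amount": [100, 100, 100, 300, 150, 100, 200],
})

keys = ["F(1)", "F(2)", "F(3)"]

# Same call as in the answer: one running total per distinct key.
df["CumSum"] = df.groupby(keys)["Amount"].cumsum()

# Variant (an assumption about the requirement, not part of the answer):
# restart the total whenever the key differs from the previous row,
# even if the same key shows up again further down the table.
changed = df[keys].ne(df[keys].shift()).any(axis=1)  # True on the first row of each run
run_id = changed.cumsum()                            # hypothetical helper: run number per row
df["CumSumByRun"] = df.groupby(run_id)["Amount"].cumsum()

print(df)
```

With the sample data the two columns come out identical, because each key occupies a single contiguous block of rows. Both versions are vectorised, so they should be reasonable choices for a table with around a million rows.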
