Pandas groupby cumulative sum ignore current row
I know there are some questions about this topic (like Pandas: Cumulative sum of one column based on value of another); however, none of them fulfill my requirements.
Let's say I have a dataframe like this one:
I want to compute the cumulative sum of Cost grouped by Month, excluding the current row's value, in order to get the Desired column. By using groupby and cumsum I obtain the column CumSum.
The code to generate the dataframe is:
import pandas as pd
df = pd.DataFrame({'Month': [1,1,1,2,2,1,3],
                   'Cost': [5,8,10,1,3,4,1]})
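For reference, the inclusive running total mentioned above (the CumSum column) can presumably be reproduced along these lines. This is a minimal sketch; the exact call the asker used is not shown in the question, so the expression is an assumption.

```python
import pandas as pd

df = pd.DataFrame({'Month': [1, 1, 1, 2, 2, 1, 3],
                   'Cost': [5, 8, 10, 1, 3, 4, 1]})

# Inclusive running total per month: every row counts its own Cost,
# which is exactly what the Desired column should NOT do.
df['CumSum'] = df.groupby('Month')['Cost'].cumsum()
print(df)
```

The first row of each month already equals its own Cost here, whereas the Desired column should start each month at 0.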
IIUC you can use groupby.cumsum and then just subtract Cost:
df['cumsum_'] = df.groupby('Month').Cost.cumsum().sub(df.Cost)
print(df)
   Month  Cost  cumsum_
0      1     5        0
1      1     8        5
2      1    10       13
3      2     1        0
4      2     3        1
5      1     4       23
6      3     1        0
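The same idea can be wrapped in a small helper so it is reusable for other key/value columns. This is only a sketch: the function name exclusive_group_cumsum and its parameters are hypothetical, not part of pandas.

```python
import pandas as pd

def exclusive_group_cumsum(frame, key, value):
    """Running total of `value` within each `key` group, excluding the current row.

    Hypothetical helper for illustration only.
    """
    # Inclusive running total per group, aligned to the original index.
    inclusive = frame.groupby(key)[value].cumsum()
    # Removing each row's own value leaves the sum of the earlier rows only.
    return inclusive - frame[value]

df = pd.DataFrame({'Month': [1, 1, 1, 2, 2, 1, 3],
                   'Cost': [5, 8, 10, 1, 3, 4, 1]})
df['cumsum_'] = exclusive_group_cumsum(df, 'Month', 'Cost')
print(df)
```

Because groupby(...).cumsum() keeps the original index, the subtraction lines up row by row even when the months are interleaved, as with row 5 here.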
You can do the following:
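# Previous Cost within each month (0 for the first row of a month)
df['agg'] = df.groupby('Month')['Cost'].shift().fillna(0)
# Accumulate the shifted values per month, so the current row is excluded
df['Cumsum'] = df.groupby('Month')['agg'].cumsum()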
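As a quick sanity check, the sketch below compares a shift-based computation (shift within each group, then accumulate within the group) against the cumsum-minus-Cost approach. The variable names by_subtraction, prev and by_shift are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'Month': [1, 1, 1, 2, 2, 1, 3],
                   'Cost': [5, 8, 10, 1, 3, 4, 1]})

# Approach 1: inclusive running total minus the current row's value.
by_subtraction = df.groupby('Month')['Cost'].cumsum() - df['Cost']

# Approach 2: previous value within each month (0 where there is none),
# then a running total of those shifted values within each month.
prev = df.groupby('Month')['Cost'].shift().fillna(0)
by_shift = prev.groupby(df['Month']).cumsum()

# shift() introduces NaN before fillna, so by_shift is float while
# by_subtraction stays integer; compare element-wise by value.
print((by_subtraction == by_shift).all())
```

Note that shift() alone yields only the immediately preceding Cost in the group, so the second cumulative step is what turns it into a running total of all earlier rows.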