cumsum on subset of pandas df columns
I have a pandas dataframe as follows:
Date Week Value1 Value2 Value3
2022-01-01 1 -10 20 30
2022-01-02 1 -5 25 20
2022-01-03 1 0 15 NaN
2022-01-04 1 5 7 10
2022-01-05 1 7 10 15
2022-01-06 1 10 5 NaN
I am looking to perform a cumulative sum such that the resulting DF is as follows:
Date Week Value1 Value2 Value3
2022-01-03 1 -15 60 50
2022-01-05 1 22 22 25
Essentially, Value3 has NaN values; no other column has them. I am looking to total up all values for the 3 Value columns between each NaN encountered in Value3. I am also looking to keep the Date and Week of the row where I encountered the NaN value as is (so the cumsum is applied only to the Value columns). So far I have tried the following (and some variations of it), but without success:
df.groupby(['Date', 'Week'])[['Value1', 'Value2', 'Value3']].apply(lambda x: x.isna().cumsum().reset_index(drop=True))
But I haven't got the desired result using this. Any ideas on how this can be achieved? Thanks!
We can use a groupby on the cumulative number of NaNs in Value3, shifted down by one row so that each NaN row closes its own group:
df.groupby(df['Value3'].shift().isna().cumsum()).agg({'Date':'last', 'Week':'last', 'Value1':'sum', 'Value2':'sum', 'Value3':'sum'}).reset_index(drop = True)
Output:
Date Week Value1 Value2 Value3
0 2022-01-03 1 -15 60 50.0
1 2022-01-06 1 22 22 25.0
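To make the grouping key easier to follow, here is a minimal self-contained sketch that rebuilds the sample data from the question and splits the one-liner into steps. The intermediate names `shifted`, `group_key` and `result` are only illustrative, and Date is kept as plain strings here as an assumption (the question does not say whether it is a datetime column).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (Date kept as strings for simplicity).
df = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-02", "2022-01-03",
             "2022-01-04", "2022-01-05", "2022-01-06"],
    "Week": [1, 1, 1, 1, 1, 1],
    "Value1": [-10, -5, 0, 5, 7, 10],
    "Value2": [20, 25, 15, 7, 10, 5],
    "Value3": [30, 20, np.nan, 10, 15, np.nan],
})

# Step 1: shift Value3 down one row, so each NaN marker lands on the row
# *after* the original NaN (the first row becomes NaN as nothing precedes it).
shifted = df["Value3"].shift()

# Step 2: flag the NaNs and take a running count. Because of the shift,
# each NaN row ends its own group instead of starting the next one.
group_key = shifted.isna().cumsum()
print(group_key.tolist())  # [1, 1, 1, 2, 2, 2]

# Step 3: aggregate per group: last Date/Week, sum of the Value columns.
result = (
    df.groupby(group_key)
      .agg({"Date": "last", "Week": "last",
            "Value1": "sum", "Value2": "sum", "Value3": "sum"})
      .reset_index(drop=True)
)
print(result)
```

Without the `shift()`, the running count would increase on the NaN row itself, so that row would be grouped with the rows after it rather than the rows before it. Note also that `sum` skips NaN by default, which is why Value3 totals 50.0 and 25.0.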