Pandas cumulative diff from groupby
I need to calculate each value's difference from the first value in its MultiIndex level, that is, the decay since the start of that level. My example input and output look something like this:
            values
place time
A     a        120
      b        100
      c         90
      d         50
B     e         11
      f         12
      g         10
      h          9
            values
A     a        NaN
      b        -20
      c        -30
      d        -70
B     e        NaN
      f          1
      g         -1
      h         -2
I can use a groupby to get the difference between consecutive cells in a level:
df.groupby(level=0)['values'].diff()
But that's not quite what I want!
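To make the gap concrete, here is a minimal sketch that rebuilds the first example and runs the per-level diff. Building the index with pd.MultiIndex.from_arrays is an assumption, since the question only shows the frame's printed form.

```python
import pandas as pd

# Assumed construction of the example frame; the question only shows it printed.
index = pd.MultiIndex.from_arrays(
    [list("AAAABBBB"), list("abcdefgh")], names=["place", "time"]
)
df = pd.DataFrame({"values": [120, 100, 90, 50, 11, 12, 10, 9]}, index=index)

# Difference between each row and the previous row within the same place.
step_diff = df.groupby(level=0)["values"].diff()
print(step_diff)
```

For place A this yields NaN, -20, -10, -40 (row-to-row steps), whereas the wanted result is NaN, -20, -30, -70 (distance from the first row of the level).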
Alas, the accepted answer isn't quite what I want. I've got a better example:
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays)
bar one      1000
    two       800
    three     500
foo one       800
    two       400
    three     200
expected_result = pd.DataFrame([np.nan, -200, -500, np.nan, -400, -600], index=arrays)
bar one       NaN
    two      -200
    three    -500
foo one       NaN
    two      -400
    three    -600
But df.groupby(level=0).diff().cumsum() gives:
pd.DataFrame([np.nan, -200, -500, np.nan, -900, -1100], index=arrays)
bar one       NaN
    two      -200
    three    -500
foo one       NaN
    two      -900
    three   -1100
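The reason the foo rows come out wrong can be seen by splitting the chain in two. This is only a sketch of what happens, using the variable names per_group_diff and running_total as illustrative labels: diff() returns an ordinary DataFrame, so the cumsum() that follows is no longer grouped and keeps accumulating across the bar/foo boundary.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays)

# Step 1: grouped diff. Each group starts with NaN, as intended.
per_group_diff = df.groupby(level=0).diff()

# Step 2: plain cumsum on the result. The grouping is gone here, so the
# running total from 'bar' (-500) is carried into the 'foo' rows.
running_total = per_group_diff.cumsum()
print(running_total)
```

The -500 accumulated in bar is why foo shows -900 and -1100 instead of -400 and -600.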
Are you looking for a cumsum?
df.groupby(level=0)['values'].diff().cumsum()
You can get what I wanted by chaining another groupby:
arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays)
bar one      1000
    two       800
    three     500
foo one       800
    two       400
    three     200
expected_result = pd.DataFrame([np.nan, -200, -500, np.nan, -400, -600], index=arrays)
df.groupby(level=0).diff().groupby(level=0).cumsum()
bar one       NaN
    two      -200
    three    -500
foo one       NaN
    two      -400
    three    -600
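A different route to the same numbers, not taken from this thread and offered only as a sketch: subtract each group's first value directly instead of summing the step differences back up. The names first_in_group and decay are illustrative. Masking the first row of each group with cumcount() is an added step so that the output shows NaN there, matching the expected result, rather than 0.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays)

grouped = df.groupby(level=0)

# Broadcast each group's first value to every row of that group.
first_in_group = grouped[0].transform("first")

# Distance from the group's starting value; blank out the starting row itself.
decay = (df[0] - first_in_group).where(grouped.cumcount() > 0)

# The chained-groupby approach from the answer above, for comparison.
chained = df.groupby(level=0).diff().groupby(level=0).cumsum()[0]
print(decay)
```

Both decay and chained hold NaN, -200, -500, NaN, -400, -600 on this data. The subtraction form may be easier to read when the goal is explicitly "change since the start of the group".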