Pandas cumulative difference from groupby

Question

I need to calculate the difference from the first value within each group of a MultiIndex level, in order to measure the decay from the start of that level. My example input and output look something like this:

               values
place time     
A     a           120
      b           100
      c            90
      d            50
B     e            11
      f            12
      g            10
      h             9

               values
place time     
A     a           NaN
      b           -20
      c           -30
      d           -70
B     e           NaN
      f             1
      g            -1
      h            -2

I can use a groupby to get the difference between consecutive cells within a level:

df.groupby(level=0)['values'].diff()
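For reference, here is a minimal sketch of what that call returns on the example above. The frame construction is an assumption (only the printed frame is shown), and the name step is illustrative.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example frame shown above.
index = pd.MultiIndex.from_arrays(
    [list("AAAABBBB"), list("abcdefgh")], names=["place", "time"]
)
df = pd.DataFrame({"values": [120, 100, 90, 50, 11, 12, 10, 9]}, index=index)

# Row-to-row change within each place; the first row of each group is NaN.
step = df.groupby(level=0)["values"].diff()
print(step.tolist())
# [nan, -20.0, -10.0, -40.0, nan, 1.0, -2.0, -1.0]
```

Each value is measured against the previous row, not against the first row of the group.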

But that's not quite what I want!

Alas, the accepted answer isn't quite what I want. I've got a better example:

import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays)

   bar one    1000
       two     800
       three   500
   foo one     800
       two     400
       three   200

expected_result = pd.DataFrame([np.nan, -200, -500, np.nan, -400, -600], index=arrays)

   bar one      NaN
       two     -200
       three   -500
   foo one      NaN
       two     -400
       three   -600

But df.groupby(level=0).diff().cumsum() gives:

pd.DataFrame([np.nan, -200, -500, np.nan, -900, -1100], index=arrays)

   bar one      NaN
       two     -200
       three   -500
   foo one      NaN
       two     -900
       three   -1100
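The carry-over can be seen by looking at the intermediate step. In this minimal sketch (the names step and running are illustrative), the ungrouped cumsum skips the NaN at the start of foo without resetting, so the running total from bar continues into foo.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays)

# Per-group differences between consecutive rows (column 0 of the frame).
step = df.groupby(level=0).diff()[0]
print(step.tolist())
# [nan, -200.0, -300.0, nan, -400.0, -200.0]

# A plain cumsum is not group-aware: NaN is skipped, the total is kept.
running = step.cumsum()
print(running.tolist())
# [nan, -200.0, -500.0, nan, -900.0, -1100.0]
```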

Answer 1

Are you looking for a cumsum?

df.groupby(level=0)['values'].diff().cumsum()

Answer 2

You can get what I wanted by chaining another groupby:

arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three'])]
df = pd.DataFrame([1000, 800, 500, 800, 400, 200], index=arrays)

   bar one    1000
       two     800
       three   500
   foo one     800
       two     400
       three   200

expected_result = pd.DataFrame([np.nan, -200, -500, np.nan, -400, -600], index=arrays)

df.groupby(level=0).diff().groupby(level=0).cumsum()

   bar one      NaN
       two     -200
       three   -500
   foo one      NaN
       two     -400
       three   -600
