Average of the differences between any two consecutive rows in a pandas DataFrame

Question

I have a dataframe:

name   date         quantity
'A'    2016-12-02   20
'A'    2016-12-04   5
'A'    2016-11-30   10
'B'    2016-11-30   10
...

What I want to do is calculate, for each pair of consecutive dates (consecutive in chronological order) for a name, the difference in quantity, and then average these differences for that name.

The dates are not necessarily presented in chronological order.

Specifically, for name A I'd want to compute +10 (2nd Dec minus 30th Nov) and -15 (4th Dec minus 2nd Dec) and then average those, to get a final result of -2.5 for this name.

Ideas?

Answer 1

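You can use groupby and apply diff with mean. Note that diff follows the current row order, so this gives the chronological result only if the rows for each name are already sorted by date: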

print (df.groupby('name')['quantity'].apply(lambda x: x.diff().mean()).reset_index())
  name  quantity
0  'A'      -2.5
1  'B'       NaN

EDIT: Since the dates are not necessarily in chronological order, sort by the date column first with sort_values:

print (df.sort_values('date')
         .groupby('name')['quantity']
         .apply(lambda x: x.diff().mean())
         .reset_index())
  name  quantity
0  'A'      -2.5
1  'B'       NaN
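
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds only the four sample rows shown in the question (the rows hidden behind "..." are not included), writes the names as plain A and B without the literal quote characters, and assumes the date column should be parsed with pd.to_datetime. The helper name mean_consecutive_diff is made up for illustration. It also shows why the sort matters for this sample.

```python
import pandas as pd

# Sample rows from the question (assumption: names without the literal quotes).
df = pd.DataFrame({
    "name": ["A", "A", "A", "B"],
    "date": ["2016-12-02", "2016-12-04", "2016-11-30", "2016-11-30"],
    "quantity": [20, 5, 10, 10],
})
# Parse the dates so that sorting is chronological.
df["date"] = pd.to_datetime(df["date"])


def mean_consecutive_diff(frame):
    """Hypothetical helper: per name, mean of row-to-row quantity differences."""
    return frame.groupby("name")["quantity"].apply(lambda s: s.diff().mean())


# Rows in the order given: A is 20, 5, 10, so the diffs are -15 and +5.
unsorted_result = mean_consecutive_diff(df)

# Rows sorted by date within each name: A is 10, 20, 5, so the diffs are +10 and -15.
sorted_result = mean_consecutive_diff(df.sort_values(["name", "date"]))

print(unsorted_result["A"])  # -5.0
print(sorted_result["A"])    # -2.5
```

B has a single row, so its only diff is NaN and the mean is NaN as well, which matches the output above.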

Answer 1 (3, accepted, 2016-12-08 11:32:33)