
Pandas: average of the difference between any two consecutive rows in a dataframe

I have a dataframe:

name   date         quantity
'A'    2016-12-02   20
'A'    2016-12-04   5
'A'    2016-11-30   10
'B'    2016-11-30   10
...

What I want to do is calculate, for each name, the difference in quantity between every pair of consecutive dates (consecutive in the chronological sense), and then average these differences per name.

The dates are not necessarily listed in chronological order.

Specifically, for name A I'd want to compute +10 (2 Dec minus 30 Nov: 20 - 10) and -15 (4 Dec minus 2 Dec: 5 - 20), and then average those to get a final result of -2.5 for this name.

Any ideas?

You can use groupby and apply diff followed by mean to each group:

print (df.groupby('name')['quantity'].apply(lambda x: x.diff().mean()).reset_index())
  name  quantity
0  'A'      -2.5
1  'B'       NaN

EDIT: Because the rows are not necessarily in chronological order, sort by the date column with sort_values first, so that diff compares chronologically adjacent rows:

print (df.sort_values('date')
         .groupby('name')['quantity']
         .apply(lambda x: x.diff().mean())
         .reset_index())
  name  quantity
0  'A'      -2.5
1  'B'       NaN
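
For anyone who wants to reproduce this, here is a minimal self-contained sketch built from the four rows shown in the question. The elided rows ("...") are not included, the names are stored without the literal quotes, and the output column name mean_diff is my own choice, not something from the question. Converting date with pd.to_datetime is an extra precaution so the sort is chronological regardless of how the column was loaded.

```python
import pandas as pd

# Sample data: only the four rows visible in the question.
df = pd.DataFrame({
    "name": ["A", "A", "A", "B"],
    "date": ["2016-12-02", "2016-12-04", "2016-11-30", "2016-11-30"],
    "quantity": [20, 5, 10, 10],
})

# Parse the dates so that sorting is chronological rather than lexical.
df["date"] = pd.to_datetime(df["date"])

result = (
    df.sort_values(["name", "date"])       # chronological order within each name
      .groupby("name")["quantity"]
      .apply(lambda s: s.diff().mean())    # mean of consecutive differences
      .reset_index(name="mean_diff")       # "mean_diff" is an illustrative name
)
print(result)
```

A name with a single row (B here) has no consecutive pair, so diff yields only NaN and the mean is NaN as well.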
