Pandas average of the difference between any two consecutive rows in dataframe
I have a dataframe:
name date quantity
'A' 2016-12-02 20
'A' 2016-12-04 5
'A' 2016-11-30 10
'B' 2016-11-30 10
...
What I want to do is calculate, for each pair of consecutive dates (consecutive as in chronological) for a name, the difference in quantity, and then average these differences for that name.
The dates are not necessarily presented in chronological order.
Specifically, for name A, I'd want to compute +10 (difference 2nd Dec - 30th Nov) and -15 (difference 4th Dec - 2nd Dec) and then average those, to get a final result of -2.5 for this name.
Ideas?
You can use groupby and apply diff with mean. Note that diff works on the rows in their current order, so this assumes each name's rows are already sorted by date:
print(df.groupby('name')['quantity'].apply(lambda x: x.diff().mean()).reset_index())
name quantity
0 'A' -2.5
1 'B' NaN
EDIT: You can add sort_values by the date column first, so that diff follows chronological order:
print(df.sort_values('date')
        .groupby('name')['quantity']
        .apply(lambda x: x.diff().mean())
        .reset_index())
name quantity
0 'A' -2.5
1 'B' NaN
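For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds only the four rows shown in the question and makes a few assumptions that are not in the original post: the names are plain strings A and B, the dates are parsed with pd.to_datetime so the sort is chronological, and the helper name mean_consecutive_diff is purely illustrative.

```python
import pandas as pd

# Sample rows copied from the question (only the four rows shown there).
df = pd.DataFrame({
    "name": ["A", "A", "A", "B"],
    "date": ["2016-12-02", "2016-12-04", "2016-11-30", "2016-11-30"],
    "quantity": [20, 5, 10, 10],
})

# Assumption: parse the dates so sorting is chronological, not lexical.
df["date"] = pd.to_datetime(df["date"])


def mean_consecutive_diff(frame):
    """Per name, average the quantity differences between chronologically consecutive rows."""
    ordered = frame.sort_values("date")
    return (
        ordered.groupby("name")["quantity"]
        .apply(lambda s: s.diff().mean())  # first diff per group is NaN and is skipped by mean
        .reset_index()
    )


result = mean_consecutive_diff(df)
print(result)
```

As a sanity check, the mean of consecutive differences telescopes to (last - first) / (n - 1) once the rows are in date order. For A that is (5 - 10) / 2 = -2.5, which matches the result above. B has a single row here, so there is no difference to average and the result is NaN.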