Calculate mean and standard deviation in a time-series
I have the following dataframe:
COD ACT DATE
0 5713 1.0 2020-07-16
1 5713 1.0 2020-08-11
2 5713 1.0 2020-06-20
3 5713 1.0 2020-06-19
4 5713 1.0 2020-06-01
5 23369 1.0 2020-07-17
6 23369 1.0 2020-08-07
7 23369 1.0 2020-09-02
8 23369 1.0 2020-11-22
9 32012 1.0 2020-06-02
10 32012 1.0 2020-07-26
I want to calculate the mean and standard deviation of each COD over the whole time series. Previously I was calculating them like this:
df['MEAN'] = df.groupby("COD")["ACT"].transform("mean")
df['STD'] = df.groupby("COD")["ACT"].transform("std")
But this calculated the mean over the time span between the initial and final ACT timestamps (for example, 3 ACT within 5 months, not 8 months). ACT is the timestamp for the activity, but the whole time series covers 8 months. I want to calculate the mean and standard deviation for the whole 8 months. Can anyone help me?
What you're looking for is an apply function on the groupby. Make sure to convert the DATE column to a datetime object.
df.groupby("COD").apply(lambda x: x["ACT"].mean())
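As a minimal sketch of the conversion step, assuming the dataframe from the question is rebuilt by hand as below, pd.to_datetime turns DATE into a datetime column before grouping. The variable name per_cod_mean is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "COD": [5713, 5713, 5713, 5713, 5713,
            23369, 23369, 23369, 23369,
            32012, 32012],
    "ACT": [1.0] * 11,
    "DATE": ["2020-07-16", "2020-08-11", "2020-06-20", "2020-06-19", "2020-06-01",
             "2020-07-17", "2020-08-07", "2020-09-02", "2020-11-22",
             "2020-06-02", "2020-07-26"],
})

# Convert the DATE strings to datetime64 values
df["DATE"] = pd.to_datetime(df["DATE"])

# Per-COD mean of ACT; this gives the same numbers as the apply/lambda form
per_cod_mean = df.groupby("COD")["ACT"].mean()
print(per_cod_mean)
```

Because every ACT value in the sample is 1.0, this per-row mean is 1.0 for each COD, which is why an aggregation per time period is needed to say anything about the whole series.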
I also thought it might help to get a month-wise sum and mean analysis for every COD.
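One possible way to do a month-wise analysis over the whole period is sketched below. It sums ACT per COD and month, fills months without activity with 0, and then takes the mean and standard deviation across all months. The names all_months, full_index, monthly and stats are illustrative, and the month range is an assumption: it runs from the first to the last month seen in the sample, so for the real 8-month series you would pass the actual start and end months instead.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "COD": [5713, 5713, 5713, 5713, 5713,
            23369, 23369, 23369, 23369,
            32012, 32012],
    "ACT": [1.0] * 11,
    "DATE": ["2020-07-16", "2020-08-11", "2020-06-20", "2020-06-19", "2020-06-01",
             "2020-07-17", "2020-08-07", "2020-09-02", "2020-11-22",
             "2020-06-02", "2020-07-26"],
})
df["DATE"] = pd.to_datetime(df["DATE"])

# Month of each activity as a monthly Period
df["MONTH"] = df["DATE"].dt.to_period("M")

# Assumption: the series spans the first to the last observed month.
# Replace the bounds with the true start and end of the 8-month window.
all_months = pd.period_range(df["MONTH"].min(), df["MONTH"].max(), freq="M")

# Every (COD, month) pair, including months with no activity
full_index = pd.MultiIndex.from_product(
    [df["COD"].unique(), all_months], names=["COD", "MONTH"]
)

# Monthly sum of ACT per COD, with missing months filled with 0
monthly = (
    df.groupby(["COD", "MONTH"])["ACT"]
      .sum()
      .reindex(full_index, fill_value=0)
)

# Mean and sample standard deviation over all months, per COD
stats = monthly.groupby(level="COD").agg(["mean", "std"])
print(stats)
```

The resulting columns can be merged back onto the original rows with df.merge(stats, left_on="COD", right_index=True) if a per-row MEAN and STD are wanted, as in the question.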