How can I find the average of the previous n rows in a pandas column based on a date condition?

Question

I have a dataset that looks like this:

value1  value2  value3  date
17      21      22      2005-04-01 12:05:00
19      20      24      2005-04-01 12:06:00
16      26      23      2005-04-01 12:07:00

I need to transform it so that each row whose timestamp ends in :05:00 (the 5th minute of each hour) holds the average of the previous 60 rows.

I tried groupby on the datetime column. It does give me the average for each clock hour (minutes 00-59), but I need to adjust it for my case.

In the end I would like to have something like this:

value1  value2  value3  date
17      21      22      2005-04-01 12:05:00
19      20      24      2005-04-01 13:05:00
16      26      23      2005-04-01 14:05:00

where 17, for instance, is the average of the 60 previous values in the value1 column.

Answer 1

This computes a rolling mean over 60-minute windows (make sure the date column has dtype datetime64[ns]; if not, convert it beforehand). You can then select the rows you need with .loc[]:

df.rolling('60min', on='date').mean().loc[lambda x: x['date'].dt.minute == 5]

See the docs for further details on .rolling() and .loc[].
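To make the dtype conversion and the window boundaries concrete, here is a minimal sketch on made-up data (one row per minute; the frame, its values, and the names incl, excl, and at_05 are assumptions for illustration, not part of the question). By default a time-based rolling window is closed on the right, so it covers the current row plus the 59 before it. If "previous 60 rows" should exclude the current row, closed='left' is one way to get that.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one row per minute, 11:00 through 13:59.
dates = pd.date_range("2005-04-01 11:00", periods=180, freq="min")
df = pd.DataFrame({
    "value1": np.arange(180, dtype=float),
    "value2": np.arange(180, dtype=float) * 2,
    "date": dates.astype(str),  # stored as strings on purpose
})

# A time-based window needs a datetime64 column, so convert first.
df["date"] = pd.to_datetime(df["date"])

# Default window (t - 60min, t]: the current row plus the 59 rows before it.
incl = df.rolling("60min", on="date").mean()

# closed='left' gives [t - 60min, t): the 60 rows strictly before the current one.
excl = df.rolling("60min", on="date", closed="left").mean()

# Keep only the rows at the 5th minute of each hour.
at_05 = df["date"].dt.minute == 5
result_incl = incl.loc[at_05]
result_excl = excl.loc[at_05]

print(result_excl)
```

Note that the first selected row (11:05 here) is averaged over fewer than 60 rows, because the data starts at 11:00 and time-based windows default to min_periods=1. Passing min_periods=60 would turn such incomplete windows into NaN instead.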

Answer 1
0 accepted 2019-04-27 11:38:39
