
Creating a Rolling Mean with a Changing Window on Pandas

I have a Pandas DataFrame as shown below. I want to create a 7-day rolling mean of the temperature. I understand how to do this with one reading per day ( dataset['rolling_temp'] = dataset.iloc[:,3].rolling(window=7).mean() ), but here the number of readings per day varies, i.e. one day may span multiple rows.
Any help would be much appreciated!

    day   temperature 
1     1          18.0           
2     1          19.0
3     2          18.0
4     3          17.0
5     4          18.5 
6     4          19.0
7     5          18.0
8     6          19.0
9     7          18.5
10    8          17.5
11    9          17.0
12   10          18.0
13   11          19.0
14   12          19.5
15   13          16.5
16   13          17.0

How about doing a .groupby first and then doing .rolling? Grouping by day collapses multiple readings into one value per day, so a fixed window of 7 rows then covers 7 days (provided no day is missing).

rolling_temp = dataset.groupby('day')['temperature'].mean().rolling(window=7).mean()
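As a minimal sketch of the groupby-then-rolling idea on the sample data from the question: the names `daily_mean` and `rolling_daily` are illustrative, and the code assumes every day appears at least once, so that 7 rows of the grouped series really span 7 days. Note that this yields a mean of daily means, which weights each day equally regardless of how many readings it has.

```python
import pandas as pd

# Sample data copied from the question.
dataset = pd.DataFrame({
    "day": [1, 1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13],
    "temperature": [18.0, 19.0, 18.0, 17.0, 18.5, 19.0, 18.0, 19.0,
                    18.5, 17.5, 17.0, 18.0, 19.0, 19.5, 16.5, 17.0],
})

# One value per day: the mean of that day's readings (indexed by day).
daily_mean = dataset.groupby("day")["temperature"].mean()

# 7-row window over the daily series; the first 6 days are NaN.
rolling_daily = daily_mean.rolling(window=7).mean()

# Optionally broadcast the per-day result back onto the original rows.
dataset["rolling_temp"] = dataset["day"].map(rolling_daily)
```

Rows that share a day receive the same rolling value, so the original row count is preserved.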

You should be able to produce rolling statistics if you convert your days into proper dates and use them as the index. You will need months and years as well, so add those columns if you do not store them already, and then:

import pandas as pd
# Build real timestamps; a '7D' window needs a sorted DatetimeIndex
dataset['date'] = pd.to_datetime(dataset[['year', 'month', 'day']])
dataset = dataset.set_index('date').sort_index()
dataset.temperature.rolling('7D', min_periods=1).mean()
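The time-based window can be tried on the question's data with a short sketch like the one below. The data only has day numbers, so the start date 2020-01-01 is an arbitrary assumption made purely for illustration. Unlike averaging daily means, this averages every individual reading that falls inside the trailing 7 days.

```python
import pandas as pd

dataset = pd.DataFrame({
    "day": [1, 1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13],
    "temperature": [18.0, 19.0, 18.0, 17.0, 18.5, 19.0, 18.0, 19.0,
                    18.5, 17.5, 17.0, 18.0, 19.0, 19.5, 16.5, 17.0],
})

# Assumption: day 1 corresponds to 2020-01-01 (hypothetical start date).
start = pd.Timestamp("2020-01-01")
dataset["date"] = start + pd.to_timedelta(dataset["day"] - 1, unit="D")

# Offset windows require a monotonic DatetimeIndex; duplicates are allowed.
readings = dataset.set_index("date").sort_index()

# Trailing 7-day window over all readings, evaluated row by row.
rolling = readings["temperature"].rolling("7D", min_periods=1).mean()

# One value per date: keep the last row of each date, which has seen
# all of that date's readings.
per_day = rolling.groupby(level=0).last()
```

Taking the last row per date is a cautious choice: it gives the complete window for that date even when several readings share the same timestamp.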

For reference, see the bottom of this page. You can also try resampling the index to daily frequency first:

dataset.temperature.resample('D').mean().rolling('7D', min_periods=1).mean()

Note that this may not work with older versions of pandas, so if you run into errors, consider upgrading to the latest stable release.
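The resampling variant is mainly useful when some days have no readings at all. The sketch below uses a small made-up series (the dates and values are hypothetical, not from the question) with a gap on one day: resampling to daily frequency inserts that day as NaN, and the rolling mean then skips it.

```python
import pandas as pd

# Hypothetical readings: two on Jan 1, none on Jan 2, one each on Jan 3 and 4.
readings = pd.Series(
    [18.0, 19.0, 17.0, 18.5],
    index=pd.to_datetime(
        ["2020-01-01", "2020-01-01", "2020-01-03", "2020-01-04"]
    ),
    name="temperature",
)

# Daily means; the missing day appears as NaN.
daily = readings.resample("D").mean()

# Trailing 7-day mean of the daily values; NaN days are ignored, and
# min_periods=1 allows a result as soon as one valid value is in the window.
rolling = daily.rolling("7D", min_periods=1).mean()
```

After resampling there is exactly one row per calendar day, so the result has a regular daily index even where the raw data had gaps.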
