Pandas: Calculate the average over all of the columns for n rolling rows at a time

What I am trying to do is this: I have a time series and I want to calculate a rolling average over n rows across multiple columns. Initially I added another column holding the average of each row and then took a standard rolling average of that column over n rows. However, when some of the columns have no values, that throws off my calculations.

Example:

Col1 | Col2 | Col3 | Avg
10   | 20   |      | 15
     | 10   |      | 10
10   | 15   |  20  | 15

Rolling average of Avg: 13.33

While it should be: 14.17 (the mean of the six values that are present, 85 / 6)

Here is an example that worked for me, where every cell has a value:

Col1 | Col2 | Col3 | Avg 
10   | 20   |   15 | 15
10   | 10   |   10 | 10
10   | 15   |   20 | 15

Rolling average of Avg: 13.33

And it should be: 13.33

What I can do is a manual loop. I could also add a second column holding the number of elements in each row.

But is there a better way to do it?

np.nanmean averages every value in a multi-dimensional array, ignoring NaNs.

np.nanmean(df.values)

14.166666666666666
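To reproduce that number, here is a minimal sketch that rebuilds the first example from the question. The DataFrame construction is an assumption about how the data is laid out (blank cells stored as NaN, and no Avg column in the frame); the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Assumed layout of the question's first example: blank cells become NaN.
df = pd.DataFrame({
    "Col1": [10, np.nan, 10],
    "Col2": [20, 10, 15],
    "Col3": [np.nan, np.nan, 20],
})

# The question's original approach: average each row, then average the averages.
row_avg = df.mean(axis=1)               # NaNs are skipped per row
mean_of_row_means = row_avg.mean()      # every row gets equal weight

# Pooled approach: average all non-NaN cells at once.
pooled_mean = np.nanmean(df.values)     # every value gets equal weight

print(mean_of_row_means, pooled_mean)
```

The two results differ because the row-wise average gives a row with one value the same weight as a row with three values, while the pooled mean weights each value equally.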

To apply this over a rolling window of 3 rows, you could do this:

pd.Series({df.index[i]: np.nanmean(df.iloc[i-2:i+1].values) for i in range(2, len(df))})

2    14.166667
dtype: float64
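If the Python-level loop becomes slow on a long series, one possible vectorized alternative follows the idea from the question of tracking the number of elements per row: divide a rolling sum of row totals by a rolling sum of row counts. This is only a sketch, not part of the original answer; the helper name rolling_pooled_mean and the example frame are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def rolling_pooled_mean(frame, n):
    """Mean of all non-NaN cells in each window of n consecutive rows.

    Hypothetical helper: rolling sum of per-row totals divided by the
    rolling sum of per-row non-NaN counts.
    """
    totals = frame.sum(axis=1).rolling(n).sum()    # sum() skips NaN by default
    counts = frame.count(axis=1).rolling(n).sum()  # count() counts non-NaN cells
    return totals / counts

# Assumed layout of the question's first example (blank cells as NaN).
df = pd.DataFrame({
    "Col1": [10, np.nan, 10],
    "Col2": [20, 10, 15],
    "Col3": [np.nan, np.nan, 20],
})

result = rolling_pooled_mean(df, 3)
print(result)
```

Unlike the dictionary comprehension, this keeps the first n - 1 rows in the output as NaN, since those windows are incomplete; calling result.dropna() gives the same index as the loop version.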
