Pandas: Calculate the average over all of the columns for n rolling rows at a time

Question

What I am trying to do is this: I have a time series and I want to calculate a rolling average over n rows across multiple columns. My first approach was to add another column containing the average of each row and then take a standard rolling average of that column over n rows. However, when some columns have missing values, this throws off the calculation, because every row's average gets equal weight no matter how many values the row contains.

Example:

Col1 | Col2 | Col3 | Avg
10   | 20   |      | 15
     | 10   |      | 10
10   | 15   |  20  | 15

Rolling average of Avg: 13.33

While it should be: 14.17 (the mean of the six values actually present, 85 / 6)

Here is an example with all the numbers present, where this approach works:

Col1 | Col2 | Col3 | Avg 
10   | 20   |   15 | 15
10   | 10   |   10 | 10
10   | 15   |   20 | 15

Rolling average of Avg: 13.33

Which is the correct value: 13.33

I could write a manual loop, or add a second column containing the number of values in each row and use it to weight the averages.

But is there a better way to do it?

Answer 1

np.nanmean averages every element of a multi-dimensional array, ignoring NaN values.

np.nanmean(df.values)

14.166666666666666
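To reproduce that number, here is a minimal sketch that rebuilds the frame from the question, assuming the blank cells are NaN (the question does not say how they are stored). It also shows why the two approaches disagree: averaging the row averages weights each row equally, while np.nanmean weights each value equally.

```python
import numpy as np
import pandas as pd

# Frame from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Col1": [10, np.nan, 10],
    "Col2": [20, 10, 15],
    "Col3": [np.nan, np.nan, 20],
})

# Mean of the per-row means: each row counts once.
mean_of_row_means = df.mean(axis=1).mean()

# Mean over all non-missing cells: each value counts once.
overall_mean = np.nanmean(df.to_numpy())

print(round(mean_of_row_means, 2))  # 13.33
print(round(overall_mean, 2))       # 14.17
```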

To apply this over a rolling window of 3 rows, you could do this:

pd.Series({df.index[i]: np.nanmean(df.iloc[i-2:i+1].values) for i in range(2, len(df))})

2    14.166667
dtype: float64
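For longer frames, a vectorized variant may be worth trying. It follows the idea from the question of tracking how many values each row holds: sum the values and the counts per row, roll both over n rows, and divide. This is only a sketch; the function name rolling_mean_all_columns is made up for illustration, and the example frame again assumes the blank cells are NaN.

```python
import numpy as np
import pandas as pd

def rolling_mean_all_columns(df, n):
    """Mean of all non-missing values in each window of n rows (hypothetical helper)."""
    row_sums = df.sum(axis=1)      # NaN is skipped by default
    row_counts = df.count(axis=1)  # number of non-missing values per row
    # Total of the values in the window divided by how many values there are.
    return row_sums.rolling(n).sum() / row_counts.rolling(n).sum()

# Frame from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Col1": [10, np.nan, 10],
    "Col2": [20, 10, 15],
    "Col3": [np.nan, np.nan, 20],
})

result = rolling_mean_all_columns(df, 3)
print(result)
```

The first n - 1 entries are NaN because the window is not yet full, and a window containing no values at all also yields NaN (0 / 0).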
