
Pandas: Calculate the average over all of the columns for n rolling rows at a time

I have a time series and want to calculate a rolling average over n rows across multiple columns. My first approach was to add a column containing the average of each row and then take a standard rolling average of that column over n rows. However, when some columns have missing values, this throws off the calculation, because a row with one value counts as much as a row with three.

Example:

Col1 | Col2 | Col3 | Avg
10   | 20   |      | 15
     | 10   |      | 10
10   | 15   |  20  | 15

Rolling average of Avg: 13.33

While it should be: 14.17 (85 / 6, the mean of the six values actually present)

Here is an example with no missing values, where my approach works:

Col1 | Col2 | Col3 | Avg 
10   | 20   |   15 | 15
10   | 10   |   10 | 10
10   | 15   |   20 | 15

Rolling average of Avg: 13.33

Which matches the expected value: 13.33

I could write a manual loop, or add a second column containing the number of values in each row.

But is there a better way to do it?

np.nanmean averages every element of a multi-dimensional array, ignoring NaN values.

np.nanmean(df.values)

14.166666666666666
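
To reproduce that number, here is a minimal sketch that rebuilds the question's first table, assuming the blank cells are NaN. The variable names row_avg, mean_of_means and pooled_mean are illustrative only. It also shows why averaging the Avg column differs from averaging all values directly.

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Col1": [10, np.nan, 10],
    "Col2": [20, 10, 15],
    "Col3": [np.nan, np.nan, 20],
})

# Per-row mean (NaN skipped by default): 15, 10, 15.
row_avg = df.mean(axis=1)

# Mean of the row means weights every row equally: 40 / 3 = 13.33...
mean_of_means = row_avg.mean()

# Mean over all non-NaN cells weights every value equally: 85 / 6 = 14.1666...
pooled_mean = np.nanmean(df.to_numpy())

print(mean_of_means, pooled_mean)
```

The two results only agree when every row has the same number of non-missing values, which is why the fully populated example in the question works.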

To apply this over a rolling window of 3 rows, you could do this:

pd.Series({df.index[i]: np.nanmean(df.iloc[i-2:i+1].values) for i in range(2, len(df))})

2    14.166667
dtype: float64
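
A vectorized alternative, following the question's idea of tracking the number of values in each row, might look like the sketch below. It is not part of the original answer. The function name rolling_pooled_mean is hypothetical, and the blank cells are again assumed to be NaN. It divides the rolling sum of the row sums by the rolling sum of the row counts, which avoids a Python-level loop over windows.

```python
import numpy as np
import pandas as pd

def rolling_pooled_mean(frame, n):
    """Mean of all non-NaN cells in each window of n consecutive rows.

    Hypothetical helper for illustration, not a pandas API.
    """
    row_sum = frame.sum(axis=1)      # sum of each row, NaN skipped
    row_count = frame.count(axis=1)  # number of non-NaN cells per row
    # Rolling totals over n rows; the first n - 1 positions are NaN.
    return row_sum.rolling(n).sum() / row_count.rolling(n).sum()

# Sample data from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Col1": [10, np.nan, 10],
    "Col2": [20, 10, 15],
    "Col3": [np.nan, np.nan, 20],
})

result = rolling_pooled_mean(df, 3)
print(result)
```

Unlike the dictionary comprehension, this keeps the original index and leaves NaN in the first n - 1 rows, where a full window is not yet available. Call .dropna() on the result if only complete windows are wanted.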

