Time weighted moving average in pandas

Question

I would like to compute a time-weighted moving average in pandas that weights observations in proportion to how recent they are.

Here is some sample data that I have.

import numpy as np
import pandas as pd

dates = ['01/01/2021','02/01/2021','03/01/2021','04/01/2021','05/01/2021','06/01/2021']
swimmer1_place = ['1','1','4','3',np.nan,np.nan]
swimmer2_place = [np.nan,'3','1',np.nan,'4','2']
swimmer3_place = ['2','2','3',np.nan,'3','1']

df = pd.DataFrame({'date':dates,'swimmer_1_place':swimmer1_place,'swimmer_2_place':swimmer2_place,'swimmer_3_place':swimmer3_place})
df['date'] = pd.to_datetime(df['date'])

What would be the best way to go about this? I have tried the built-in pandas EWM method with limited success, because it does not account for the varying time intervals between each swimmer's observations.

Answer 1

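First convert the place values from strings to integers, leaving missing values as NaN. Since this converts the lists, do it before building the DataFrame:

def convert(x):
    # keep missing values as NaN; cast everything else to int
    if pd.isna(x):
        return np.nan
    return int(x)

swimmer1_place = [convert(a) for a in swimmer1_place]
swimmer2_place = [convert(a) for a in swimmer2_place]
swimmer3_place = [convert(a) for a in swimmer3_place]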

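You can then compute time-based weights as follows (there may well be other ways to do this). Each non-missing value is replaced by its weighted contribution, so summing a column gives that swimmer's weighted average: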

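# Assumption: base_time is not defined in the original answer. Any reference
# date earlier than the first observation gives later rows larger weights.
base_time = df['date'].min() - pd.Timedelta(days=1)

for column in df.columns:
    if column != 'date':
        df[column] = pd.to_numeric(df[column])  # no-op if already numeric
        mask = df[column].notna()
        # time distance of each non-missing observation from base_time
        deltas = (base_time - df['date'])[mask].values
        # normalise so the weights sum to 1
        weights = deltas / np.sum(deltas)
        column_vals = df.loc[mask, column].values
        # a single .loc call avoids chained assignment
        df.loc[mask, column] = weights * column_vals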
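
As one of those other possible ways, here is a minimal sketch that returns a single recency-weighted average per column and skips missing rows, so irregular gaps between a swimmer's results are handled by date rather than by row position. It uses an exponential half-life decay, which is a different weighting from the linear one above. The function name time_weighted_average, the parameters as_of and halflife_days, the 30-day default and the example data are all assumptions made for illustration, not part of the question or of pandas.

```python
import numpy as np
import pandas as pd


def time_weighted_average(values, dates, as_of, halflife_days=30.0):
    """Mean of the non-missing values, each weighted by 0.5 ** (age / halflife)."""
    vals = pd.to_numeric(pd.Series(values), errors="coerce").to_numpy(
        dtype=float, na_value=np.nan
    )
    when = pd.to_datetime(pd.Series(dates))
    # age of every observation in days, measured back from `as_of`
    age = ((pd.Timestamp(as_of) - when) / pd.Timedelta(days=1)).to_numpy(dtype=float)
    # use only observed values that are not in the future relative to `as_of`
    keep = ~np.isnan(vals) & (age >= 0)
    if not keep.any():
        return float("nan")
    weights = 0.5 ** (age[keep] / halflife_days)
    return float(np.sum(weights * vals[keep]) / np.sum(weights))


# Made-up example data: one swimmer, one missing result
example = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-16", "2021-01-31"]),
    "place": ["1", np.nan, "3"],
})

# Moving version: evaluate the average as of each row's own date
example["place_twma"] = [
    time_weighted_average(example["place"], example["date"], d)
    for d in example["date"]
]
```

Applying the same list comprehension to each swimmer column of df would give one moving-average column per swimmer. A shorter half-life makes the average react faster to recent results, while a longer one smooths more.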

solution1
0 2021-08-22 10:46:47
