pandas calculate rolling_std of top N dataframe rows

I have a dataframe like this:

date      A
2015.1.1  10
2015.1.2  20
2015.1.3  30
2015.1.4  40
2015.1.5  50
2015.1.6  60

I need to calculate the std of the top N rows, such as:

date      A  std
2015.1.1  10  std(10)
2015.1.2  20  std(10,20)
2015.1.3  30  std(10,20,30)
2015.1.4  40  std(10,20,30,40)
2015.1.5  50  std(10,20,30,40,50)
2015.1.6  60  std(10,20,30,40,50,60)

pd.rolling_std can do this for a fixed window, but how can I change N dynamically for each row?

df[['A']].apply(lambda x:pd.rolling_std(x,N))

It could be done by calling apply on the df like so:

In [29]:
def func(x):
    # x.name is the row's index label (an integer here); x.index holds the column names
    return df.iloc[:x.name + 1][x.index].std()
df['std'] = df[['A']].apply(func, axis=1)
       date   A        std
0  2015.1.1  10        NaN
1  2015.1.2  20   7.071068
2  2015.1.3  30  10.000000
3  2015.1.4  40  12.909944
4  2015.1.5  50  15.811388
5  2015.1.6  60  18.708287

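This uses double subscripts [[]] so that apply is called on a df with a single column rather than on a Series. That lets you pass axis=1 and call your function row-wise. Inside the function each row arrives as a Series: its name attribute is the row's index label and its index attribute holds the column names. With those two values you can slice the df from the first row up to the current row and take the std of that slice. The first row gives NaN because the sample std of a single value is undefined.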
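The same cumulative std can also be sketched without apply. This is an alternative to the approach above, not what the answer itself uses, and it assumes a pandas version recent enough to provide Series.expanding. The sample frame is rebuilt from the data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2015.1.1", "2015.1.2", "2015.1.3",
             "2015.1.4", "2015.1.5", "2015.1.6"],
    "A": [10, 20, 30, 40, 50, 60],
})

# An expanding window grows by one row at each step, so row i sees rows 0..i.
# std() here is the sample std (ddof=1), so the first row is NaN.
df["std"] = df["A"].expanding().std()
print(df)
```

For larger frames this is likely to be faster than a row-wise apply, since it avoids re-slicing the df for every row.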

You can add a window arg to func to modify the window as desired.
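A minimal sketch of what such a window arg could look like. The parameter name window and its None default are assumptions made for illustration, and this variant returns a scalar rather than a one-element Series. It also assumes the default integer index, so that x.name is the row position.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2015.1.1", "2015.1.2", "2015.1.3",
             "2015.1.4", "2015.1.5", "2015.1.6"],
    "A": [10, 20, 30, 40, 50, 60],
})

def func(x, window=None):
    # x.name is the integer position of the current row (default index assumed).
    end = x.name + 1
    # window=None means "all rows so far"; otherwise keep only the last `window` rows.
    start = 0 if window is None else max(0, end - window)
    # x.index[0] is the single column name, 'A'.
    return df.iloc[start:end][x.index[0]].std()

# Extra keyword arguments given to apply are forwarded to func.
df["std_all"] = df[["A"]].apply(func, axis=1)
df["std_3"] = df[["A"]].apply(func, axis=1, window=3)
print(df)
```

With window=3 each row uses at most the three most recent values, while leaving window unset reproduces the growing window from the answer.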


It looks like your index is a str; in that case the following should work:

In [39]:
def func(x):
    # .loc slices by label and includes the end label (.ix was removed in pandas 1.0)
    return df.loc[:x.name][x.index].std()
df['std'] = df[['A']].apply(func, axis=1)

           A        std
2015.1.1  10        NaN
2015.1.2  20   7.071068
2015.1.3  30  10.000000
2015.1.4  40  12.909944
2015.1.5  50  15.811388
2015.1.6  60  18.708287

