Vectorised solution to fill rows of a column based on value from previous row in Python

Question

Dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({"X":[1,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan]})

print(df)
     X
0  1.0
1  NaN
2  NaN
3  NaN
4  NaN
5  NaN

I want to fill each np.nan with the square of the value in the previous row, plus one.

Desired output:

         X
0       1.0
1       2.0
2       5.0
3      26.0
4     677.0
5  458330.0

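This can be done with a for loop:

col = df.columns.get_loc("X")  # integer position of column X
for i in range(1, len(df)):
    # a single .iloc call avoids chained assignment, which may not write back to df
    df.iloc[i, col] = df.iloc[i - 1, col] ** 2 + 1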

But I am looking for a vectorised solution to the same problem.

Answer 1

Unfortunately, a vectorised solution is problematic here, because each value depends on the previously computed output. To improve performance, you can use numba:

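from numba import jit

@jit(nopython=True)
def f(a):
    # each element depends on the one just computed, so the loop is sequential
    for i in range(1, a.shape[0]):
        a[i] = a[i-1] ** 2 + 1
    return a

# f writes into its argument, so pass a writable copy of the column
df['X'] = f(df['X'].to_numpy(copy=True))
print(df)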
                X
0    1.000000e+00
1    2.000000e+00
2    5.000000e+00
3    2.600000e+01
4    6.770000e+02
5    4.583300e+05
6    2.100664e+11
7    4.412789e+22
8    1.947270e+45
9    3.791862e+90
10  1.437822e+181
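
If numba is not available, the same recurrence can be sketched with itertools.accumulate from the standard library. This is only an illustration of the sequential dependency, not a vectorised solution: it still visits the rows one by one and will likely be slower than the numba version on long columns. The six-row frame below is an assumption chosen to match the desired output in the question.

```python
from itertools import accumulate

import numpy as np
import pandas as pd

# Assumed sample data: a seed value followed by NaN placeholders.
df = pd.DataFrame({"X": [1.0] + [np.nan] * 5})

# accumulate starts from the first element and feeds every result back in
# as `prev`; the second argument (the NaN placeholder) is ignored.
values = list(accumulate(df["X"].to_numpy(), lambda prev, _: prev ** 2 + 1))

df["X"] = values
print(df)
```

Because every step needs the result of the previous one, this runs in a plain Python loop under the hood, just like the for loop in the question.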


solution1
2 2020-05-25 10:11:49
